Dynamic Weighted Data Structures
Samuel  W.  Bent 

This  thesis  discusses  implementations  of  an  abstract  data  structure  called  a 
dynamic  dictionary.  Such  a  data  structure  stores  a  collection  of  items,  each  of 
which  is  equipped  with  a  key  and  a  weight.  Among  the  operations  we  might  wish 
to  perform  on  such  a  collection  are: 

(a)  accessing  an  item,  given  its  key 

(b)  inserting  a  new  item 

(c)  deleting  an  item 

(d)  joining  two  collections  into  one 

(e)  splitting  a  collection  into  two 

(f)  changing  the  weight  of  an  item 

Operations (b)-(f) provide the dynamic nature of the data structure.

In addition we want the implementation to respect the weights, so that accessing
a heavy item is quicker than accessing a light one. In an optimal binary
tree, the path length to an item of weight w in a collection of total weight W is
proportional to log(W/w). By relaxing the optimality constraint and considering
different  kinds  of  trees,  it  is  possible  to  retain  this  logarithmic  access  time  (with 
a  larger  constant  factor),  and  simultaneously  achieve  similar  logarithmic  times  for 
the  dynamic  operations. 

Two  new  data  structures  are  proposed,  biased  2-3  trees  and  biased  weight- 
balanced  trees.  They  achieve  the  logarithmic  time  bounds  provided  the  cost  is 
amortized over a sequence of operations. These data structures have applications to
the network flow problem and to the design of “self-organizing” data structures.

Chapter 1

Introduction  and  Motivation 


One morning I awoke to find Sherlock Holmes intently rearranging note cards on the
breakfast table. As his friend and physician, I was pleased to see that he had emerged
from the lethargic stupor into which he had fallen in the last few weeks; it meant that once
again he had found a project worthy of his great intellect. With the greatest curiosity as
to the nature of his labor I greeted him.

“What are you doing, Holmes?”

“Organizing my master criminal index,” he said without removing his eyes from the
table. “I find it a matter of some difficulty to arrange these cards in a manner suited to
my needs.”

I  glanced  at  the  cards  and  noticed  each  was  labelled  with  the  name  of  one  of  London's 
fiends, followed by dates, descriptions, modi operandi, and all the other facts Holmes had
gleaned in his tenure as the world’s foremost consulting investigator.

“Why don't you just arrange them alphabetically?” I asked.

“Excellent, my dear Watkins! You have reduced the problem to the utmost simplicity
and applied the soundest logic to produce a solution. However, it won't do.”

My smile fell suddenly. “Why not?”

“You see, Watkins, you would treat Moriarty the same as a common pickpocket.
Somehow I want to place the cards of archvillains such as Colonel Moran and Irene Adler
in a more prominent position than those of the minions and laborers of the criminal world.
Yet I must not ignore the alphabet, for clues often appear in the form of monogrammed
handkerchiefs or initials in correspondence. Furthermore, as a man’s evil star ascends, so
should his card achieve greater prominence, which should then fade as Lestrade and his
colleagues (with whatever humble assistance we may provide) curb the activities of the
offender.”

“Why,  Holmes,  that  seems  impossible!”  I  cried. 

“Never  say  ‘impossible’,  Watkins.  Surely  man  is  clever  enough  to  overcome  the 
difficulties  nature  provides  him.” 

Just  then,  Mrs.  Hudson  ushered  into  the  room  the  man  who  introduced  us  to  the 
curious  case  of  the  Giant  Rat  of  Sumatra,  a  tale  I  hope  to  be  able  to  add  some  day  to  the 
public  record.  However,  our  conversation  over  that  breakfast  table  lingered  in  my  mind 
for  many  years,  and  I  often  tried  to  invent  a  simple  system  that  would  satisfy  Holmes’ 
requirements. 

To  state  the  problem  more  precisely,  Holmes  wishes  to  manage  collections  of  items. 
Each item contains information about a single criminal, most of which is unimportant to
the organization of the index. For that task, all that is important is the name of the
felon, which we may call the key of the item, and the importance Holmes attaches to him,
which we may call the weight of the item. Holmes needs to gain access to items, to split
a  collection  into  smaller  pieces,  to  unite  collections  into  larger  ones,  to  change  the  weights 
of  items,  and  to  add  or  delete  items.  Each  of  these  operations  must  take  into  account  the 
weights;  they  should  proceed  very  quickly  on  the  more  important  items  at  the  expense  of 
proceeding  more  slowly  on  the  less  important  ones. 

In  short,  Holmes  wants  an  implementation  of  an  abstract  data  structure  called  a 
dynamic  dictionary,  in  which  the  cost  of  each  operation  is  a  function  of  the  weights  involved, 
both  of  the  operand  and  of  the  entire  dictionary. 


1.1.  Dynamic  dictionaries. 

A  dynamic  dictionary  is  an  abstract  data  structure  that  stores  a  collection  of  items. 
Each  item  may  have  a  number  of  attributes,  most  of  which  depend  on  the  application.  For 
the  purpose  of  maintaining  the  data  structure,  the  important  attributes  of  an  item  are  its 
key and its weight.

The keys are drawn from a totally ordered set K, called the key space. Typically the
keys  are  integers  or  alphabetic  strings,  with  the  usual  ordering  relation.  For  simplicity, 
we  will  assume  that  a  dictionary  can  store  only  one  item  with  a  particular  key.  (This 
assumption  can  be  dispensed  with  either  by  a  convention  about  equal  keys  or  by  enlarging 
the  key  to  disambiguate  equal  keys;  both  techniques  are  standard,  and  neither  affects  any  of 
the implementations to be discussed here.) In some applications, the keys and the ordering
of the items are implicit in the data structure, in which case it makes no sense to talk about
the  key  space  K  or  equal  keys. 

The weights are real numbers strictly greater than zero. (In some applications it
may be necessary to restrict the weights to be bounded away from zero by choosing some
number ε > 0 as a lower bound.) They are assigned by the user of the data structure, who
presumably has chosen them to indicate the relative importance of the items. The manner
in  which  an  implementation  is  expected  to  respect  the  weights  will  be  discussed  soon. 

In the context of a particular dictionary D, suppose we are given a key K. The item
of K, denoted I(K), is defined to be the item in D whose key is K, if there is such an item;
it is undefined if there is not. If the item of K is defined, it is unique, since D may contain
at most one item with key K.

The weight of K, denoted W(K), is the weight of the item of K; it is undefined if the
item of K is also undefined.

Our key K partitions the items in D into three classes, namely those items whose
keys are less than K, those items whose keys are greater than K, and the item of K
itself. The first two of these classes are called the left-items of K and the right-items of
K, respectively.

If the key of each item in D is less than the keys of all items in another dictionary D',
we say that D precedes D'. This relation between dictionaries is a necessary condition for
the  join  operation  to  be  well-defined  (see  below). 

With  this  terminology,  we  may  define  the  following  operations  on  a  dynamic  dictionary 
D (or on a pair of dictionaries $D_1$ and $D_2$):

1. access. Given a key K, return the item of K, or an indication that no such item
exists.

2. join. If $D_1$ precedes $D_2$, construct a new dictionary D containing all the items of $D_1$
and $D_2$, and discard the old dictionaries. (join is undefined if $D_1$ does not precede
$D_2$.)

3. split. Given a key K, split D into three parts: a new dictionary $D_1$ containing the
left-items of K, the item of K, and a new dictionary $D_2$ containing the right-items
of K.

4. delete. Given a key K, discard the item of K from D.

5. promote. Given a key K and a real number δ, add δ to the weight of K.

6. demote. Given a key K and a real number δ, subtract δ from the weight of K,
provided that the resulting weight is still positive (or greater than the lower bound ε).

7. insert. Given an item, add it to D, provided that its key is different from all keys in
D. (If D already holds an item with that key, do nothing or signal an error.)
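
To make the semantics of the seven operations concrete, the following Python sketch keeps the items in a sorted array. It is an illustration only: the class name NaiveDictionary, its method names, and the free function join are hypothetical and not taken from the thesis. The sketch ignores the weights while searching, so it makes no attempt at the weight-sensitive running times discussed later.

```python
from bisect import bisect_left


class NaiveDictionary:
    """Sorted-array stand-in for a dynamic dictionary (semantics only)."""

    def __init__(self, items=()):
        pairs = sorted(items)                      # (key, weight) pairs, distinct keys
        self.keys = [k for k, _ in pairs]
        self.weights = [w for _, w in pairs]

    def _find(self, key):
        # Position where `key` is, or would be, in key order.
        i = bisect_left(self.keys, key)
        found = i < len(self.keys) and self.keys[i] == key
        return i, found

    def access(self, key):
        i, found = self._find(key)
        return (key, self.weights[i]) if found else None

    def insert(self, key, weight):
        i, found = self._find(key)
        if found or weight <= 0:
            raise ValueError("duplicate key or non-positive weight")
        self.keys.insert(i, key)
        self.weights.insert(i, weight)

    def delete(self, key):
        i, found = self._find(key)
        if found:
            del self.keys[i], self.weights[i]

    def promote(self, key, delta):
        i, found = self._find(key)
        if found:
            self.weights[i] += delta

    def demote(self, key, delta):
        # Applied only if the resulting weight stays positive.
        i, found = self._find(key)
        if found and self.weights[i] - delta > 0:
            self.weights[i] -= delta

    def split(self, key):
        # Returns (left-items, item of key or None, right-items).
        i, found = self._find(key)
        j = i + 1 if found else i
        left = NaiveDictionary(zip(self.keys[:i], self.weights[:i]))
        right = NaiveDictionary(zip(self.keys[j:], self.weights[j:]))
        item = (key, self.weights[i]) if found else None
        return left, item, right


def join(d1, d2):
    """Join is defined only when d1 precedes d2."""
    if d1.keys and d2.keys and d1.keys[-1] >= d2.keys[0]:
        raise ValueError("d1 does not precede d2")
    return NaiveDictionary(zip(d1.keys + d2.keys, d1.weights + d2.weights))
```

Every operation here costs time linear in the number of items in the worst case, regardless of weight, which is exactly the behavior a weighted implementation is meant to improve on.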

Using  these  definitions,  we  get  a  data  structure  that  is  used  in  a  top-down  manner.  A 
typical  command  consists  of  a  key,  a  dictionary,  and  an  operation.  Before  the  operation 
can be carried out, we must first search the dictionary for the appropriate item, starting
from a root associated with the name of the dictionary. The search will involve comparing
the given key with keys in the dictionary and making decisions based on the total ordering
of K.

An  alternative  way  to  use  a  dictionary  is  the  bottom-up  manner.  Instead  of  a  key, 
we  are  given  an  explicit  pointer  to  an  item,  so  we  need  not  do  any  searching.  Rather,  we 
apply  the  operation  directly  to  the  item,  in  the  context  of  whatever  dictionary  happens  to 
contain  it.  With  this  style  of  query,  the  access  operation  is  replaced  by  a  new  operation: 

8.  find.  Given  a  pointer  to  an  item,  return  the  name  of  the  dictionary  containing  that 
item. 

Since  no  searching  is  done,  there  is  no  need  to  have  keys  at  all.  The  ordering  among  items 
can  be  implicit  in  the  way  the  items  came  to  be  in  the  same  dictionary.  Whenever  we  join 
two dictionaries $D_1$ and $D_2$ (in that order), we simply define that the items in $D_1$ precede
the items in $D_2$.

The  way  a  dictionary  is  used  depends  on  the  needs  of  the  application.  The  more 
familiar  manner  of  use,  and  the  one  assumed  by  the  algorithms  presented  here,  is  the  top- 
down  manner.  However,  some  important  applications  assume  bottom-up  use.  Fortunately, 
it is fairly simple to adapt the top-down algorithms to bottom-up ones, and the analyses
presented  here  carry  through  with  little  change. 


1.2.  Performance  goals  and  entropy. 

The  definition  of  an  abstract  data  structure  says  nothing  about  how  the  operations 
should  be  implemented,  nor  about  how  fast  they  should  be.  However,  the  user  of  a  dynamic 
dictionary  expects  the  implementation  to  favor  the  heavier  items,  in  the  sense  that  an 
access  of  a  heavy  item  should  take  less  time  than  an  access  of  a  light  one,  a  split  at  a 
heavy  item  should  be  faster  than  a  split  at  a  light  one,  etc. 

The  weights  have  the  following  meaning  to  the  implementor  of  a  dynamic  dictionary. 
The  implementor  assumes  that  an  item  is  queried  with  probability  equal  to  the  proportion 
of the total weight represented by that item. Thus if a dictionary D contains (at a particular
time) items $I_1, I_2, \ldots, I_k$ with weights $w_i = W(I_i)$ for $i = 1, \ldots, k$, it has total weight

\[ W = \sum_{1 \le i \le k} w_i \]

and the probability of a command involving item $I_i$ is (assumed to be)

\[ \frac{w_i}{W}. \]

The  user  is  expected  to  assign  weights  to  items  with  this  in  mind. 

As  we  use  trees  to  implement  dictionaries,  we  make  the  following  standard  definition. 

Definition 1.1. Let T be a tree and let $w = (w_1, \ldots, w_k)$ be the list of the weights of the
items stored in T. The total weighted path length of T, denoted $L(T)$, is given by

\[ L(T) = \sum_{1 \le i \le k} w_i l_i, \]

where $l_i$ is the length of the path from the root of T to the ith item. The average weighted
path length of T is simply

\[ \frac{L(T)}{W} = \sum_{1 \le i \le k} \frac{w_i}{W}\, l_i. \]

We  will  measure  the  efficiency  of  an  implementation  of  an  operation  by  the  weighted 
average  of  the  times  needed  for  that  operation,  as  it  is  applied  to  each  item  in  the  dictionary. 
More  precisely,  consider  a  fixed  implementation  of  the  operations  and  a  fixed  dictionary  D 
with items as above. If op is one of the operations access, find, split, or delete, and if
T is the running time function, define the cost C(op) of the operation op, as applied to D,
by the formula

\[ C(\mathrm{op}) = \sum_{1 \le i \le k} \frac{w_i}{W}\, T(\mathrm{op}(I_i)). \tag{1.1} \]


Each of these operations needs to select an item from the entire dictionary based on
comparisons among the keys. Shannon's theorem on perfect encoding [12, p. 50] says that
the weighted average, taken over all items, of the number of binary comparisons necessary
just to do this selection is at least $H(w_1, \ldots, w_k)$, where H is the discrete entropy function,
defined by the formula

\[ H(w_1, \ldots, w_k) = \sum_{1 \le i \le k} \frac{w_i}{W} \lg \frac{W}{w_i}. \tag{1.2} \]

(Here $\lg x$ means $\log_2 x$.) Comparing (1.2) with (1.1), we see that the best we can hope
to do is to implement the operations to run in time $\lg(W/w_i)$. Of course, the operations
involve much more than selecting an item, so we will be content with an implementation
running in time proportional to this lower bound. In other words, our goal is to implement
each operation to run in time $O(\log(W/w_i))$.
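
The comparison between the cost formula and the entropy can be checked numerically. The Python sketch below evaluates the entropy of a weight list and the per-item targets lg(W/w_i); the function names are hypothetical, and constant factors hidden by the O-notation are ignored.

```python
from math import log2


def entropy(weights):
    """Discrete entropy of the distribution w_i / W, in bits."""
    total = sum(weights)
    return sum((w / total) * log2(total / w) for w in weights)


def ideal_times(weights):
    """Per-item target lg(W / w_i) suggested by the entropy bound."""
    total = sum(weights)
    return [log2(total / w) for w in weights]


def weighted_cost(weights, times):
    """Weighted average of per-item times, as in the cost formula."""
    total = sum(weights)
    return sum((w / total) * t for w, t in zip(weights, times))
```

If every item meets its target exactly, the weighted average cost equals the entropy, which is why no implementation based on binary comparisons can do better on average.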

The  other  operations  have  similar  “best”  running  times,  differing  only  in  the  way 
the  extra  parameters  enter  into  the  picture.  The  join  operation  takes  as  arguments  two 
dictionaries $D_1$ and $D_2$ with total weights $W_1$ and $W_2$. By symmetry, we may assume that
$W_1 \ge W_2$ and define the cost by the formula


\[ C(\mathrm{join}) = T(\mathrm{join}(D_1, D_2)). \]


By analogy with the above discussion, our goal for the join operation is a running time of

\[ O\!\left(\log \frac{W_1 + W_2}{W_2}\right). \]

Similarly, the promote operation should run in time $O(\log((W + \delta)/w_i))$, and the demote
operation should run in time $O(\log(W/(w_i - \delta)))$.

Determining  a  goal  for  the  insert  operation  poses  a  thorny  problem.  At  first  glance, 
it seems tempting to expect a running time of

\[ O\!\left(\log \frac{W + w}{w}\right) \]

to insert a new item of weight w into a dictionary of total weight W. But since the key of
the new item might lie between the keys of two very light items, and since the dictionary
must respect the key ordering, it may be necessary for the insert operation to find the
two light items, and this could take a long time. More precisely, suppose $w_1$ and $w_2$ are
the weights of the items in D that immediately precede and follow the new item, according
to their keys. Then if $w_0 = \min(w, w_1, w_2)$, the insert operation should run in time

\[ O\!\left(\log \frac{W + w}{w_0}\right), \]

allowing us enough time not only to insert the new item but also to examine its two new
neighbors.
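
The running-time goals for join, promote, demote, and insert can be written down as small functions. The Python sketch below returns the quantity inside the O-notation for each goal, in base 2; the function and parameter names are hypothetical, and the constant factors are deliberately left out.

```python
from math import log2


def join_goal(w1_total, w2_total):
    """log((W1 + W2) / W2), where W2 is the smaller total weight."""
    small = min(w1_total, w2_total)
    return log2((w1_total + w2_total) / small)


def promote_goal(total, weight, delta):
    """log((W + delta) / w_i) for adding delta to an item of weight w_i."""
    return log2((total + delta) / weight)


def demote_goal(total, weight, delta):
    """log(W / (w_i - delta)); the new weight must stay positive."""
    if weight - delta <= 0:
        raise ValueError("resulting weight must be positive")
    return log2(total / (weight - delta))


def insert_goal(total, weight, pred_weight, succ_weight):
    """log((W + w) / w0) with w0 = min(w, w1, w2), the revised insert goal."""
    w0 = min(weight, pred_weight, succ_weight)
    return log2((total + weight) / w0)
```

For example, inserting an item of weight 32 between two items of weight 1 in a dictionary of total weight 96 has a revised goal of lg 128 = 7, whereas the tempting goal lg(128/32) = 2 coincides with what join_goal gives when the new item can simply be joined at one end.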

In the worst case we might find $w_0 = w_{\min}$, the weight of the lightest item in the
dictionary. Our revised goal is then much worse than our original one if w is large. But
in two important special cases this pessimistic analysis is irrelevant. First, if the new item
itself has weight equal to $w_{\min}$, the two goals are equal. And second, in the fortunate event
that the new item precedes or follows all the items in the dictionary, we can view the new
item as a dictionary unto itself, and view the insert as a special case of a join. The
analysis of the join operation then gives us a running time goal of $O(\log((W + w)/w))$.

The  remarks  in  the  preceding  paragraphs  apply  to  any  operation  involving  an  item  or 
key  that  does  not  appear  in  the  dictionary.  On  the  one  hand,  it  is  tempting  to  ask  that 
the  time  necessary  to  perform  an  operation  depend  only  on  the  operands  (the  item  and  the 
dictionary), and not on any fine structure of the dictionary such as the weights of particular
items in it. But on the other hand, it seems reasonable to expect the operation to examine
the “gap” where the item would be if it were in the dictionary, and thus to depend on the
weights of the endpoints of that gap. For now we choose reason over temptation.

Definition 1.2. Any implementation of an operation which runs within the bounds mentioned
above is said to achieve logarithmic performance (or to have logarithmic behavior,
or  to  run  in  logarithmic  time). 

This  thesis  describes  in  detail  two  different  implementations  of  dynamic  dictionaries, 
called  biased  2-3  trees  and  biased  weight-balanced  trees.  These  two  data  structures  perform 
all  the  operations  of  Section  1.1  in  logarithmic  time  (with  one  technical  exception  for  biased 
2-3 trees), provided that we amortize the running time over a sequence of operations.
The nature of the amortization is particularly pleasant: starting with a forest of trivial
dictionaries,  in  which  each  item  of  the  universe  is  a  dictionary  unto  itself,  the  total  time 
needed  for  any  sequence  is  at  most  the  sum  of  the  logarithmic  bounds  for  the  operations 
in  the  sequence,  even  though  the  time  needed  for  a  single  operation  may  be  more  than  its 
corresponding  bound.  In  other  words,  any  particular  operation  may  use  more  time  than 
the logarithmic bound, but only if previous operations used correspondingly less.

Furthermore, the amortization applies only to operations that alter the dictionaries,
such  as  join  or  split.  The  access  and  find  operations  always  run  in  logarithmic  time, 
and  if  they  do  not  use  all  the  time  they  are  entitled  to,  the  extra  time  does  not  need  to  be 
made  available  to  later  operations. 
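
The amortized guarantee can be phrased as a condition on prefixes of an operation sequence: the time spent so far never exceeds the sum of the logarithmic bounds so far. The Python sketch below checks that condition for given lists of actual costs and bounds. It is only a bookkeeping illustration under the assumption that both are measured in the same units; the function names are hypothetical.

```python
from itertools import accumulate


def banked_credit(actual, bounds):
    """Unused allowance after each operation: bounds so far minus time spent so far."""
    spent = accumulate(actual)
    allowed = accumulate(bounds)
    return [a - s for s, a in zip(spent, allowed)]


def respects_amortized_bounds(actual, bounds):
    """True if no prefix of the sequence overspends its total allowance."""
    return all(credit >= 0 for credit in banked_credit(actual, bounds))
```

In the sequence with costs 1, 1, 6, 2 and bound 3 per operation, the third operation exceeds its own bound, but the two cheap operations before it have banked enough credit to pay for the excess.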


1.3.  Applications. 

The  initial  motivation  for  this  research  was  an  application  to  the  maximum  network 
flow  problem.  By  using  an  earlier  version  of  biased  2-3  trees  [6]  as  the  basis  of  a  sophisticated 
array  of  data  structures,  Sleator  [36]  presented  an  implementation  of  Dinits’  algorithm 
for  finding  a  maximum  flow  in  a  network  with  n  vertices  and  m  edges  that  runs  in  time 
O(mn log n), improving on the best previously known bound by a factor of log n. The
technique  he  used,  which  relies  on  biased  2-3  trees  to  represent  efficiently  certain  partially 
explored  paths  in  the  network,  applies  to  several  other  problems,  such  as  the  transshipment 
problem  and  finding  nearest  common  ancestors  [37].  These  applications  use  dictionaries  in 
the  bottom-up  manner,  and  never  need  to  insert  in  the  middle  of  a  tree. 

Another  application  is  a  “self-organizing”  data  structure.  The  problem  is  to  handle 
queries  on  items  whose  access  probabilities  are  not  known  a  priori,  but  to  give  an  item 
more  importance  if  it  appears  to  be  accessed  more  frequently.  Dynamic  dictionaries  are 
well-suited  to  the  task.  Each  access  is  coupled  with  a  promote  that  increases  the  weight 
of  an  item  by  1.  Thus  the  weight  of  an  item  is  its  reference  count;  in  the  long  run  the  ratio 
of  an  item’s  weight  to  the  total  weight  will  converge  to  its  (unknown)  access  probability. 
The  insert  problem  (see  Section  1.2)  is  irrelevant  here  since  we  insert  new  items  with 
weight  1,  the  minimum  weight  in  the  tree.  This  method  gives  logarithmic  performance 
and  maintains  a  tree  with  optimal  path  length  (up  to  a  constant  factor)  without  using 
heuristics.  Of  course  the  complication  of  saving  reference  counts  and  building  sophisticated 
data  structures  must  be  weighed  against  the  simplicity  of  heuristic  methods  [7,  35],  which 
use  simple  trees  or  lists  and  do  not  store  counts. 
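
The reference-count scheme can be simulated without any tree at all, which isolates how the weights evolve. In the Python sketch below, a key seen for the first time is inserted with weight 1 and every later access promotes it by 1, so that the weight equals the reference count. The function names are hypothetical, and treating the first reference as the insert is an assumption about how the two steps are combined.

```python
from collections import Counter


def reference_count_weights(accesses):
    """Weights after processing a sequence of accessed keys."""
    weights = Counter()
    for key in accesses:
        if key in weights:
            weights[key] += 1      # access coupled with a promote by 1
        else:
            weights[key] = 1       # insert with the minimum weight
    return weights


def estimated_probabilities(weights):
    """Ratio of each item's weight to the total weight."""
    total = sum(weights.values())
    return {key: w / total for key, w in weights.items()}
```

On a long sequence drawn from a fixed distribution, these ratios approach the access probabilities, which is what lets a weighted dictionary adapt without knowing the probabilities in advance.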

Some  large  scale  data-base  problems  might  benefit  from  these  algorithms,  especially 
if  the  access  pattern  changes  drastically  with  time.  For  example,  an  airline  reservation 
system  could  have  seasonal  patterns  of  flights:  people  fly  south  in  the  winter  and  north  in 
the  summer.  Of  course,  the  weights  must  be  highly  skewed  in  order  to  achieve  a  savings 
over less complicated data structures.


1.4. Related work.

The notion of a dynamic weighted data structure is intended to simultaneously generalize
two well-studied classes of data structures. The first class is that of dynamic data
structures,  in  which  most  of  the  dynamic  operations  defined  in  Section  1.1  are  supported, 
but  in  which  all  items  are  assumed  to  be  of  equal  importance.  The  second  class  is  that  of 
weighted  data  structures,  in  which  items  are  of  unequal  importance,  but  in  which  the  set  of 
items  to  be  stored  and  the  assignment  of  weights  to  these  items  is  fixed  beforehand.  This 
section  briefly  reviews  the  work  that  has  been  done  on  these  two  classes  as  well  as  some 
earlier  work  on  dynamic  weighted  data  structures. 

1.4.1.  Dynamic  Data  Structures. 

A great deal of work has been done on the problem of implementing a data structure
which supports some or all of the six operations
ACCESS
FIND
JOIN
SPLIT
DELETE
INSERT 

as  defined  in  Section  1.1,  in  the  case  where  all  the  items  have  equal  weights.  Of  course,  the 
promote  and  demote  operations  are  not  applicable  in  this  case. 

Although  linear  data  structures  such  as  arrays,  linked  lists,  and  hash  tables  can  be  used 
to  support  these  operations,  they  do  so  by  favoring  some  of  them  over  others.  For  example, 
we can join two linked lists in time O(1), but an access may take time of order n; we
can insert into a hash table in expected time O(1), but a split takes time of order n.
We take the point of view that this phenomenon is undesirable; we wish to minimize the
time needed for the worst operation.

The  most  appropriate  implementations,  therefore,  are  those  in  which  all  the  operations 
cost about the same. The arguments in Section 1.2 specialize in the case that all the weights
are the same (where we can take this common value to be 1), and indicate that we may
expect to perform any of the legal operations in time O(log n) on a dictionary storing
n items. A large class of data structures with this property exists, namely the class of
balanced  trees. 

Balanced trees lead to logarithmic performance by keeping the bulk of the tree “balanced”
among the various subtrees. No subtree is allowed to possess too much or too
little of the bulk. Various measures of bulk have been used, various balance conditions
have  been  proposed,  and  various  algorithms  have  been  designed  to  implement  the  dynamic 
operations  while  maintaining  the  balance  conditions.  The  most  important  measures  are 
height,  number  of  nodes,  number  of  children,  and  path  length,  leading  to  AVL  trees,  BB 
trees, 2-3 trees (and their generalization B-trees), and RB-trees, respectively.

AVL  trees,  invented  by  Adel’son-Vel’skiy  and  Landis  [1],  are  binary  trees  in  which  the 
bulk  of  a  subtree  is  measured  by  its  height.  The  heights  of  the  two  subtrees  under  a  common 
parent  must  not  differ  by  more  than  1;  the  resulting  trees  are  often  called  height-balanced 
for  this  reason.  By  storing  the  difference  of  the  children’s  heights  at  each  parent,  and 
applying  the  simple  operations  of  single  and  double  rotation,  it  is  possible  to  implement 
the access, insert, and delete operations in time O(log n). A nice description of these
algorithms appears in Knuth [22]. In his dissertation, Crane showed how to use AVL trees
to  implement  all  six  dynamic  operations  [10]. 

BB (bounded balance) trees are binary trees in which the bulk of a subtree is measured
by the number of nodes in that subtree. If a node has l nodes in its left subtree and r nodes
in its right subtree, its balance is defined to be

\[ \frac{l + 1}{l + r + 2}. \]

This balance is required to lie between α and 1 − α, for some suitably chosen α. Nievergelt
and  Reingold  show  how  to  implement  insert  and  delete  in  these  trees  by  defining  single 
and  double  rotation  operations,  and  by  describing  when  they  are  applicable  [31,  34].  (In 
[34],  the  trees  are  called  “weight-balanced”.  The  “weight”  in  the  name  is  merely  the  number 
of  nodes  in  the  tree;  it  is  not  the  same  as  the  weight  of  an  item  as  defined  in  Section  1.1.) 
The biased weight-balanced trees of Chapter 3 specialize to these trees if we set the weight
of each item to 2, use a more liberal value of α, and notice that there will never be any
sub-item  nodes. 
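
The bounded-balance condition is easy to check mechanically. The Python sketch below uses an assumed representation in which a tree is either None (empty) or a pair (left, right) of subtrees; it computes the balance (l + 1)/(l + r + 2) of a node and tests whether every node lies between α and 1 − α. The function names and the sample values of α are illustrative assumptions, not values prescribed by the text.

```python
def size(tree):
    """Number of nodes; a tree is None or a (left, right) pair."""
    return 0 if tree is None else 1 + size(tree[0]) + size(tree[1])


def balance(tree):
    """Balance of the root: (l + 1) / (l + r + 2)."""
    l, r = size(tree[0]), size(tree[1])
    return (l + 1) / (l + r + 2)


def is_bounded_balance(tree, alpha):
    """True if every node has balance between alpha and 1 - alpha."""
    if tree is None:
        return True
    return (alpha <= balance(tree) <= 1 - alpha
            and is_bounded_balance(tree[0], alpha)
            and is_bounded_balance(tree[1], alpha))


leaf = (None, None)
perfect = (leaf, leaf)                  # three nodes, root balance 1/2
left_chain = ((leaf, None), None)       # three nodes in a left path, root balance 3/4
```

Recomputing subtree sizes at every node keeps the sketch short; a real implementation would store the size in each node so that the balance can be read off in constant time.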

B-trees  [5]  are  multiway  trees  in  which  the  bulk  of  a  node  is  measured  by  the  number 
of children it has. For m-ary B-trees, each internal node is required to have between ⌈m/2⌉
and  m  children,  and  all  leaves  are  required  to  be  the  same  distance  from  the  root.  The 
simplest  case  (and  most  important  theoretically)  is  a  3-ary  B-tree,  also  known  as  a  2-3  tree, 
in  which  each  internal  node  has  2  or  3  children.  It  is  fairly  straightforward  to  implement 
all  six  dynamic  operations  using  2-3  trees,  as  described  in  Chapter  4  of  [2]. 
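
The two shape conditions stated for m-ary B-trees (between ⌈m/2⌉ and m children per internal node, and all leaves at the same distance from the root) can be verified with a short recursive check. The Python sketch below assumes an internal node is a list of children and anything else is a leaf; the function name is hypothetical, and the check applies the child-count rule to every internal node exactly as stated above.

```python
from math import ceil


def leaf_depth(tree, m):
    """Return the common leaf depth if `tree` has m-ary B-tree shape, else None."""
    if not isinstance(tree, list):
        return 0                                    # a leaf
    if not ceil(m / 2) <= len(tree) <= m:
        return None                                 # wrong number of children
    depths = {leaf_depth(child, m) for child in tree}
    if len(depths) != 1 or None in depths:
        return None                                 # invalid subtree or uneven leaves
    return depths.pop() + 1
```

With m = 3 this accepts exactly the 2-3 tree shapes, in which every internal node has 2 or 3 children.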

Guibas  and  Sedgewick  invented  RB-trees  (short  for  Red-Black  trees,  also  known  as 
dichromatic trees); these are binary trees in which each edge is colored either red or black
and all leaves are the same distance from the root, only counting black edges. Furthermore,
certain local configurations of red edges are disallowed. By choosing these configurations
appropriately, RB-trees are seen to generalize both AVL and B-trees, as well as other types
of trees [14].

As  a  partial  step  toward  a  weighted  structure,  some  proposals  have  been  made  that 
enable  an  unweighted  structure  to  handle  local  reference  efficiently  [8,  26],  or  that  make 
the  likelihood  of  consecutive  expensive  rebalancing  steps  small  [18,  18,  88). 

1.4.2.  Weighted  Data  Structures. 

Some work has been done when the items are weighted, but when no dynamic operations
are used. Knuth [25] shows how to construct the binary search tree with the optimal
weighted path length in time $O(n^2)$; Hu and Tucker [17], and more recently Garsia
and Wachs [13], also construct optimal trees in time O(n log n) under more restrictive
hypotheses.

Huffman  trees  [20,  24,  33]  are  optimal  trees  in  which  the  items  have  weights  but  no 
keys;  that  is,  their  relative  order  in  the  tree  is  immaterial.  They  can  be  constructed  in  time 
O(n log n) under very general hypotheses.

Various schemes for nearly-optimal weighted trees have appeared. Fredman [11] shows
how to construct a nearly-optimal tree in time O(n), and Bayer [4] gives a good bound on
how  close  to  optimal  it  gets.  Many  heuristics  and  empirical  results  have  also  been  shown 
for  weighted  trees  [3,  9,  32,  40]. 

1.4.3.  Dynamic  Weighted  Data  Structures. 

Mehlhorn [28, 30] proposes an implementation for dynamic dictionaries called D-trees
(for  Dynamic  trees)  in  which  it  is  possible  to  achieve  logarithmic  behavior  for  the  access, 
promote,  and  demote  operations.  However,  there  is  a  potentially  non-linear  storage 
penalty inherent in his solution. He shows how to avoid this penalty (with compact
D-trees), but the necessary manipulations are quite complicated. Our implementation is
much simpler and uses only linear space, although we only achieve amortized logarithmic
behavior (see below). But the amortization applies only to the operations that change the
dictionary, and not to an access (or a find), which can always be done in logarithmic time
in  the  worst  case.  Furthermore,  the  split  operation,  which  is  essential  in  the  network  flow 
application,  is  fast  in  our  implementation.  Mehlhorn  does  not  discuss  the  split  operation, 
but  his  implementation  does  not  appear  to  admit  a  fast  algorithm  for  it. 

Unterauer  [39]  defines  B\/%  trees.  He  claims  they  support  insert,  delete,  and  weight 
changes, but his operations involve searching the tree to find the successor (in key order)
of the relevant node, and doing rebalancing along that search path. Such an approach
cannot give logarithmic behavior, because any operation might lead to the bottom of the
tree,  regardless  of  the  weights  of  the  operands. 

Knuth [23] shows how to maintain a binary tree that has minimum weighted path
length  under  the  operations  of  increasing  or  decreasing  a  weight  by  1,  in  essentially  optimum 
running  time.  This  structure,  like  Huffman  trees,  has  weights  but  no  keys. 

1.5.  Summary  of  results. 

In  this  thesis  we  propose  two  new  implementations  for  dynamic  dictionaries,  called 
biased 2-3 trees and biased weight-balanced trees. They achieve logarithmic performance
(with one minor exception for biased 2-3 trees), but only when we amortise the cost of
maintaining  the  data  structures  over  a  sequence  of  operations.  In  other  words,  a  particular 
operation  may  take  more  than  logarithmic  time,  but  the  extra  time  is  less  than  the  time 
saved  in  previous  operations. 

Biased weight-balanced trees are more complicated than biased 2-3 trees, as they store
real  numbers  and  have  three  kinds  of  nodes  (as  opposed  to  storing  integers  in  two  kinds 
of  nodes).  However,  the  added  complication  circumvents  the  one  technical  imperfection  in 
biased  2-3  trees. 

The  way  amortisation  is  presented  here  is  more  explicit  than  previous  appearances 
of  the  idea.  We  introduce  a  physical  analog,  the  poker  chip,  which  seems  to  help  in 
understanding  where  all  the  time  is  eventually  spent.  We  prove  an  algorithm  runs  in  the 
appropriate  time  by  adding  up  the  number  of  chips  available  to  the  operation  (either  from 
the  initial  allocation  or  from  extra  chips  left  over  from  previous  operations),  and  showing 
that  the  total  exceeds  the  number  of  chips  the  operation  needs  to  spend. 

It  is  interesting  to  compare  the  time  bounds  for  this  data  structure  with  those  for  the 
weighted  path  compression  implementation  of  the  disjoint  set  data  structure  [38].  In  the 
latter case, one can do a sequence of n union and find operations in time O(n α(n, n)), but
a particular operation may take a long time (of order log n). However, a long operation will
be followed in the future by enough short ones to achieve the amortised time bound. In the
present case, whenever we do a long operation we can prove that enough short operations
have been done in the past to achieve the amortised time bound. This may be useful for
real-time  applications. 


Chapter  2 

Biased  2-3  Trees 


2.1.  Definitions  and  notation. 

Biased  2-3  trees  are  a  generalisation  of  2-3  trees  as  defined  by  Aho,  Hopcroft,  and 
Ullman  [2].  Whereas  all  the  leaves  in  a  2-3  tree  are  the  same  distance  away  from  the  root, 
leaves  in  a  biased  2-3  tree  appear  at  various  distances.  This  allows  heavier  items  to  be 
closer  to  the  root  than  lighter  ones. 

A  biased  2-3  tree  contains  two  kinds  of  nodes,  called  item  and  non-item  nodes.  An  item 
node  corresponds  to  an  external  node  (leaf)  of  a  2-3  tree;  it  has  no  children  and  contains 
one  item.  A  non-item  node  corresponds  to  an  internal  node  of  a  2-3  tree;  it  may  have 
either  two  or  three  children,  but  contains  no  items.  The  item  nodes  are  arranged  so  that 
traversing the tree in symmetric order visits the items in order of increasing key value.

We  will  be  somewhat  sloppy  about  distinguishing  between  a  node  and  the  tree  rooted 
at  that  node;  the  context  should  resolve  any  ambiguity. 

If the dictionary is used in a “top-down” manner (as defined in Section 1.1), then
non-item nodes should also contain additional “access keys” to guide searches for items.
One  possible  scheme,  as  described  in  Chapter  4  of  [2],  is  to  store  at  each  non-item  node  n 
two  extra  keys,  called  L  and  M,  equal  to  the  largest  items  in  the  left  and  middle  subtrees, 
respectively,  of  n.  These  values  must  be  updated  at  each  node  along  the  path  from  the  root 
to  n  whenever  the  tree  is  altered  at  node  n,  but  it  is  usually  a  simple  matter  to  determine 
the  new  values.  The  time  spent  doing  this  is  dominated  by  the  time  spent  altering  the 
tree, since any alteration requires traversing this path, at least in the schemes described
here. We will not mention the quantities L and M further, as no new ideas are needed to
maintain  them;  they  have  the  same  meaning  as  for  2-3  trees. 

Unbiased  2-3  trees  are  balanced  by  requiring  that  all  leaves  are  the  same  distance  from 
the  root.  Biased  2-3  trees  have  a  more  complicated  balance  condition.  Before  giving  this 
condition,  we  define  a  measure  of  a  node,  called  its  rank,  roughly  corresponding  to  the 
height  of  a  node  in  a  2-3  tree. 

Definition 2.1. The rank of a node n, denoted r(n), is defined as follows:

If n is an item node storing item I, then $r(n) = \lfloor \lg W(I) \rfloor$.

If n is a non-item node, then r(n) = 1 + max(r(m)), where m ranges over the children
of  n. 

Next  we  define  a  useful  distinction  among  the  children  of  non-item  nodes. 

Definition  2.2.  A  major  node  is  one  whose  rank  is  maximum  among  all  its  siblings.  A 
minor  node  is  a  node  that  is  not  major. 

In  other  words,  a  major  node  is  one  that  is  responsible  for  its  parent’s  rank  being  as  large 
as it is. With this terminology, r(n) = 1 + r(n'), where n' is a major child of n.

In  an  unbiased  2-3  tree,  all  items  have  the  same  weight,  say  1,  so  the  rank  of  each  leaf 
is  0.  Inductively,  we  see  that  the  ranks  of  the  children  of  an  internal  node  are  all  the  same, 
and  that  the  rank  of  a  node  equals  its  height.  In  a  biased  2-3  tree,  the  ranks  of  leaves  may 
vary,  but  we  would  still  like  the  ranks  of  the  children  of  an  internal  node  to  be  equal  (that 
is,  we  would  like  all  its  children  to  be  major  nodes).  This  turns  out  to  be  too  restrictive; 
we  cannot  achieve  this  goal  if  one  of  the  children  is  a  very  heavy  item  node.  But  we  can 
come close, as the following definitions indicate.

Definition 2.3. Two trees S and T are c-compatible, for a given integer c, if r(S) ≤ c,
r(T) ≤ c, and whenever one of r(S) and r(T) is strictly less than c, the other tree is simply
an item node with rank c.

Definition 2.4. A non-item node with rank c + 1 is balanced if each adjacent pair of its
children is c-compatible. A tree is balanced if all its internal nodes are.

In  other  words  a  node  with  rank  c  + 1  is  balanced  if  each  of  its  children  with  rank  less  than 
c  is  adjacent  only  to  item  nodes  with  rank  c.  In  still  other  words,  a  tree  is  balanced  if  each 
of  its  minor  nodes  is  adjacent  only  to  major  item  nodes. 

A  biased  2-3  tree,  then,  is  a  balanced  tree  made  up  of  binary  and  ternary  non-item 
nodes  and  item  nodes.  Besides  the  information  already  mentioned  (the  item  itself  for 
an item node and access keys for a non-item node), each node must also contain enough
information to determine its rank, as the algorithms for maintaining the trees depend
strongly on the ranks. For simplicity we will assume each node actually contains its rank
explicitly; in some applications it may be more efficient or convenient to use a less direct
system, such as storing the difference in rank between a node and its parent.
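
The definitions so far can be made concrete with a small sketch. The following Python fragment is only an illustration under our own assumptions: the class names `Item` and `NonItem` and the functions `compatible` and `balanced` are hypothetical, access keys are omitted, and ranks are recomputed from the children rather than stored. It encodes Definition 2.1 (rank), Definition 2.3 (c-compatibility), and Definition 2.4 (balance).

```python
from dataclasses import dataclass
from math import floor, log2
from typing import List, Union


@dataclass
class Item:
    """Item node: a leaf holding one item (a key and a positive weight)."""
    key: str
    weight: float

    @property
    def rank(self) -> int:
        # Definition 2.1: r(n) = floor(lg W(I)); this may be negative.
        return floor(log2(self.weight))


@dataclass
class NonItem:
    """Non-item node: two or three children and no item of its own."""
    children: List["Node"]

    @property
    def rank(self) -> int:
        # Definition 2.1: one more than the largest rank among the children.
        return 1 + max(child.rank for child in self.children)


Node = Union[Item, NonItem]


def compatible(s: Node, t: Node, c: int) -> bool:
    """Definition 2.3: are s and t c-compatible?"""
    if s.rank > c or t.rank > c:
        return False
    for low, other in ((s, t), (t, s)):
        # A tree of rank below c may only sit next to an item node of rank c.
        if low.rank < c and not (isinstance(other, Item) and other.rank == c):
            return False
    return True


def balanced(n: Node) -> bool:
    """Definition 2.4, applied to every non-item node of the tree."""
    if isinstance(n, Item):
        return True
    kids, c = n.children, n.rank - 1
    return (len(kids) in (2, 3)
            and all(compatible(a, b, c) for a, b in zip(kids, kids[1:]))
            and all(balanced(k) for k in kids))
```

For instance, a heavy item of weight 8 may sit next to a light subtree holding two items of weight 1, whereas a non-item node of rank 1 next to a single item of rank 0 violates the balance condition.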

This  completes  the  definition  of  a  biased  2-3  tree.  Figure  2.1  shows  a  typical  example. 
Item nodes are shown as squares, non-item nodes as circles. The letter inside an item node
is  the  key  of  the  item  stored  there;  the  number  above  a  node  is  its  rank.  Note  that  ranks 
may  be  negative  and  that  they  increase  along  the  path  toward  the  root  by  at  least  one. 
Access  keys  are  not  shown. 

Next  we  define  a  few  useful  notions. 

Definition  2.5.  The  weight  of  a  node  n,  denoted  W(n),  is  the  total  weight  of  all  the  items 
in  the  subtree  rooted  at  n. 

The  weight  of  a  node  is  the  measure  in  which  the  user  is  presumably  interested.  We  have 
discretised it through the rank function, but we will have to ensure that our algorithms,
although dealing with ranks, respect the weights in some sense. See Section 2.2 for more
details. 

If S is a non-trivial tree (that is, if its root is a non-item node), define S_l to be its
leftmost subtree, S_r to be its rightmost subtree, and S_m to be its “middle” subtree (provided
S is ternary).

While  maintaining  a  biased  2-3  tree,  we  may  detach  a  node  from  its  parent  and  replace 
it with another. Usually the new node is the root of a subtree very nearly equal to the tree
rooted at the old node, perhaps with a node added or deleted. When we reattach the new
node,  we  have  to  check  that  the  parent  is  still  balanced.  The  following  definition  captures 
the  notion  that  the  new  node  can  replace  the  old. 

Definition 2.6. Let R and S be trees with ranks r and s, and let c be an integer. Then
R broadens S below c if

i) s ≤ r ≤ c,

and ii) if s = r = c and S is an item node, then R is also an item node.

The  broadening  relation  is  transitive,  that  is  if  R  broadens  S  which  in  turn  broadens  T, 
then  R  broadens  T,  below  c;  it  is  monotone  in  c,  that  is  if  R  broadens  S  below  c,  then  it 
also does so below any a > c; and it extends the parent relation, that is if R is the parent
of  S,  then  R  broadens  S  below  r.  It  tells  us  when  we  can  replace  one  node  with  another, 
as  the  next  lemma  indicates. 

Lemma 2.7. If S is a node in a biased 2-3 tree whose parent has rank c + 1, and if R
broadens  S  below  c,  then  the  tree  obtained  by  replacing  S  with  R  is  a  (balanced)  biased 
2-3  tree. 

Proof.  Suppose  T  is  a  sibling  adjacent  to  S,  so  S  and  T  are  c-compatible.  We  must  prove 
that  R  and  T  are  also  c-compatible. 

If s < c, then T must be an item node with rank c, which is therefore c-compatible
with any node R with rank r ≤ c. If t < c, then S must be an item node with rank s = c;
therefore r = c by (i) and R is an item node by (ii), so R is c-compatible with T. The only
other possibility is that s = t = c, in which case r = c by (i), and so R is c-compatible
with T.
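
As a sanity check on Definition 2.6 and Lemma 2.7, the sketch below abstracts a tree to just its rank and whether its root is an item node, and exhaustively tests the lemma over a small range of ranks. The names `Tree`, `broadens`, `compatible`, and `lemma_2_7_holds` are our own and purely illustrative; the exhaustive check is a finite experiment, not a substitute for the proof.

```python
from itertools import product
from typing import NamedTuple


class Tree(NamedTuple):
    """Abstract view of a tree: only what Definitions 2.3 and 2.6 look at."""
    rank: int
    is_item: bool


def compatible(s: Tree, t: Tree, c: int) -> bool:
    """Definition 2.3 (c-compatibility)."""
    if s.rank > c or t.rank > c:
        return False
    for low, other in ((s, t), (t, s)):
        if low.rank < c and not (other.is_item and other.rank == c):
            return False
    return True


def broadens(r: Tree, s: Tree, c: int) -> bool:
    """Definition 2.6: does r broaden s below c?"""
    if not (s.rank <= r.rank <= c):                      # condition (i)
        return False
    if s.rank == r.rank == c and s.is_item and not r.is_item:
        return False                                     # condition (ii)
    return True


def lemma_2_7_holds(max_rank: int = 4) -> bool:
    """Check Lemma 2.7 for all abstract trees with ranks 0..max_rank."""
    trees = [Tree(k, item) for k in range(max_rank + 1) for item in (True, False)]
    for c in range(max_rank + 1):
        for r, s, t in product(trees, repeat=3):
            if compatible(s, t, c) and broadens(r, s, c) and not compatible(r, t, c):
                return False
    return True
```

Condition (ii) is what rules out replacing a major item node by a non-item node of the same rank, since a minor sibling next to it would then no longer be adjacent to an item node.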


2.2.  Weight,  rank,  and  the  ACCESS  operation. 

Although  we  store  the  rank  of  a  non-item  node,  the  amount  of  weight  represented  by 
the  subtree  rooted  at  a  node  is  usually  more  important  to  the  user.  The  rank  of  a  node 
ought  to  be  related  to  its  weight  in  a  significant  way.  The  following  lemma  shows  that  a 
node  with  large  rank  represents  a  substantial  amount  of  weight. 

Lemma 2.8. For any node n in a biased 2-3 tree,

$$W(n) \ge 2^{r(n)-1}.$$

Proof. If n is an item node, a stronger result is true, since $\lg W(n) \ge \lfloor \lg W(n) \rfloor = r(n)$, so

$$W(n) \ge 2^{r(n)}.$$

If n is a non-item node with rank r = r(n), it has among its children either

a) one major item node n' at rank r - 1, so $W(n) \ge W(n') \ge 2^{r(n)-1}$ (by the strong
result for item nodes), or

b) at least two major nodes $n_1$ and $n_2$ at rank r - 1, so, applying the lemma inductively
to each of them, $W(n) \ge W(n_1) + W(n_2) \ge 2 \cdot 2^{r(n)-2} = 2^{r(n)-1}$.


We  access  the  item  with  key  K  by  comparing  K  with  the  access  keys  of  nodes 
(beginning  at  the  root)  and  recursively  searching  the  appropriate  subtree.  This  is  the 
standard  tree  search  algorithm. 

Lemma 2.9. The access time for an item is proportional to the length of the path from
the  item  to  the  root. 

Proof.  At  each  step  along  the  path  from  the  root  to  the  item  node  containing  the  desired 
item,  we  do  a  constant  amount  of  work  comparing  the  search  key  with  a  small  (fixed) 
number  of  access  keys  and  deciding  which  subtree  to  search.  Thus  the  total  amount  of 
work  is  proportional  to  the  length  of  this  path. 
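
The search can be sketched as follows. This is a minimal illustration, not the thesis's code: the field `bounds` plays the role of the access keys L and M (here, the largest key in every subtree except the last), and the names `Item`, `NonItem`, `max_key`, and `access` are hypothetical. The function also returns the number of steps taken, which is the path length of Lemma 2.9.

```python
from dataclasses import dataclass, field
from typing import List, Union


@dataclass
class Item:
    key: str
    weight: float


@dataclass
class NonItem:
    children: List["Node"]
    bounds: List[str] = field(init=False)  # access keys, one per non-last child

    def __post_init__(self):
        # Largest key in each subtree except the rightmost one.
        self.bounds = [max_key(child) for child in self.children[:-1]]


Node = Union[Item, NonItem]


def max_key(n: Node) -> str:
    """Largest key stored below n (items are kept in symmetric order)."""
    return n.key if isinstance(n, Item) else max_key(n.children[-1])


def access(root: Node, key: str):
    """Standard tree search; returns (item node or None, path length)."""
    node, steps = root, 0
    while isinstance(node, NonItem):
        for child, bound in zip(node.children, node.bounds):
            if key <= bound:          # key belongs in this subtree
                node = child
                break
        else:                         # larger than every access key
            node = node.children[-1]
        steps += 1
    return (node if node.key == key else None), steps


# A heavy item "a" next to a light subtree holding "b" and "c".
tree = NonItem([Item("a", 8), NonItem([Item("b", 1), Item("c", 1)])])
```

In this example the heavy item is found after one step and the light items after two, in line with the idea that heavier items sit closer to the root.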

The  next  lemma  shows  that  this  path  is  not  very  long. 

Lemma 2.10. The length of the path from an item to the root is at most lg(W/w) + 2,
where w is the weight of the item and W is the total weight of the tree.

Proof. Let l be the length of the path. Since the rank of the item is $\lfloor \lg w \rfloor$, and since ranks
increase by at least 1 at each step along the path toward the root, the root must have rank
at least $\lfloor \lg w \rfloor + l$. By Lemma 2.8, we have

$$W \ge 2^{\lfloor \lg w \rfloor + l - 1} > 2^{\lg w + l - 2},$$

so

$$l < \lg \frac{W}{w} + 2.$$

The  cost  of  most  operations  on  biased  2-3  trees  needs  to  be  amortised,  but  not  so  for  the 
access  operation  (nor  for  the  find  operation,  for  which  the  results  of  this  section  also  hold). 
The  access  operations  can  always  be  done  in  logarithmic  time,  without  amortisation. 


Figure 2.2.

Case 1, two large nodes.


Figure 2.3.

Case 2, S is a large item node.


2.3. The JOIN operation.

The  fundamental  algorithm  for  maintaining  biased  2-3  trees  is  called  partial  join  (or 
pjoin for short). It takes as input two trees S and T and an integer c, and tries to join
the trees into a single tree with rank c (or less). If this is not possible, it returns two
c-compatible trees. For simplicity, the algorithm is given assuming that S has rank at least
as  large  as  T;  the  other  case  is  handled  symmetrically. 

Algorithm. pjoin S and T at rank c.

Input: Two trees S and T, with ranks s and t respectively. Integer c.

Preconditions: S precedes T, and t ≤ s ≤ c.

Output: Either (i) one tree R, or (ii) two c-compatible trees R_1 and R_2.

Postconditions: Either (i) r ≤ c, R broadens both S and T below c, and R stores the
items of S and T in key order, or (ii) R_1 and R_2 are c-compatible, R_1
broadens S below c, R_2 broadens T below c, R_1 precedes R_2, and R_1
and R_2 store the items of S and T in key order.

By symmetry, assume s ≥ t. Call a node large if it has rank c, and small if
it has rank less than c. A large node will become a major node if it is attached to a new
parent with rank c + 1. There are five cases, depending on the configuration of S and T;
the algorithm simply uses the first case that applies.

Case 1. [S and T are both large nodes.] If s = t = c, then return in case (ii) with R_1 = S
and R_2 = T. See Figure 2.2.

Case 2. [S is a large item node.] (At this point, t < c.) If S is simply an item node and
s = c, return in case (ii) with R_1 = S and R_2 = T. See Figure 2.3.

Case 3. [S is a small item node.] (At this point, if S is an item node, then s < c.) If S is
an item node, create a new binary non-item node R with rank s + 1, set R_l ← S
and R_r ← T, and return in case (i). See Figure 2.4.

Case 4. [S and T are both small nodes.] (At this point, S is not an item node.) If s < c,
recursively pjoin S and T at rank s, then distinguish two subcases:

a) If the recursive pjoin produced one tree R', return in case (i) with R = R'.
See Figure 2.5.

b) Otherwise it produced two s-compatible trees R'_1 and R'_2, so create a new
binary non-item node R with rank s + 1, set R_l ← R'_1 and R_r ← R'_2, and
return in case (i). See Figure 2.6.

Figure 2.4.

Case 3, S is a small item node.

Figure 2.5.

Case 4a, two unequal small nodes.

Figure 2.6.

Case 4b, two equal small nodes.

Case 5. [S is a large non-item node, T is a small node.] (At this point, t < s = c, and
S is not an item node.) Otherwise, recursively pjoin S_r and T at rank s - 1, then
distinguish three subcases:

a) [Replace S_r.] If the recursive pjoin returned one tree R', then set S_r ← R'
and return in case (i) with R = S. See Figure 2.7.

b) [Change binary to ternary.] Otherwise the recursive pjoin returned two
(s - 1)-compatible trees R'_1 and R'_2. If S was binary, then set S_m ← R'_1,
S_r ← R'_2, and return in case (i) with R = S. See Figure 2.8.

c) [Split ternary node.] If S was ternary, then set S_r ← S_m, and S_m ← ∅.
Create a new binary non-item node R with rank s, set R_l ← R'_1, R_r ← R'_2,
and return in case (ii) with R_1 = S and R_2 = R. See Figure 2.9.

Figure 2.9.

Case 5c, S is a ternary non-item node.

Proposition 2.11. The pjoin algorithm is correct.

Proof. It suffices to show that the pjoin algorithm produces balanced trees satisfying the
postconditions, assuming the input satisfies the preconditions. It is easy to see that the items
end up in the correct order, so it remains to prove that the broadening and compatibility
conditions are met, and that the resulting trees are balanced. There is nothing to prove in
Cases 1 and 2, because any tree broadens itself. In Case 3, S and T are s-compatible so R
is a balanced tree, and since t ≤ s < r = s + 1 ≤ c, R broadens both S and T below c.

Having disposed of the base cases, we may assume that any recursive calls to pjoin
work correctly. In Case 4a, R' broadens S and T below s, hence R broadens S and T
below c > s. Case 4b is like Case 3, noting that R broadens R'_1 and R'_1 broadens S,
so R broadens S below c (similarly for T). In Case 5a, S_r was (s - 1)-compatible with
its neighbor, so R' must be too, since it broadens S_r below c - 1; thus R is a balanced
tree that broadens S and its child R', and hence it also broadens T. In Case 5b, S_r was
(s - 1)-compatible with S_l, so R'_1 must be too; thus R is a balanced tree that broadens S
and its children, hence it also broadens T. Finally in Case 5c, R_1 and R_2 are balanced trees
at rank s, since S_l and R'_1 were respectively (s - 1)-compatible with S_m and R'_2; they are
c-compatible since they both have rank s = c, and they broaden S and T below c because
S is not an item node and t < c.

The algorithm to join S and T is now easily written as “pjoin S and T at rank s + 1.”
Since  both  nodes  are  small,  the  pjoin  will  start  in  Case  3  or  4,  and  will  return  one  tree  R 
with  all  the  desired  properties. 
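
The five cases translate fairly directly into code. The sketch below is one possible rendering under our own assumptions, not the thesis's implementation: access keys and chips are omitted, the result of pjoin is a Python list holding one tree (case (i)) or two trees (case (ii)) in key order, and the symmetric situation in which the right-hand tree has the larger rank is handled by the flag `big_first`. The helpers `is_biased_23`, `keys`, and `build` are hypothetical and exist only to exercise the sketch.

```python
from math import floor, log2


class Item:
    def __init__(self, key, weight):
        self.key, self.weight = key, weight
        self.rank = floor(log2(weight))          # Definition 2.1


class NonItem:
    def __init__(self, children, rank):
        self.children, self.rank = children, rank  # rank stored explicitly


def pjoin(left, right, c):
    """Partial join at rank c; returns [R] or [R1, R2] in key order."""
    big_first = left.rank >= right.rank
    S, T = (left, right) if big_first else (right, left)  # S has larger rank
    order = (lambda a, b: [a, b]) if big_first else (lambda a, b: [b, a])
    s = S.rank
    assert s <= c
    if T.rank == c:                              # Case 1: both large
        return [left, right]
    if isinstance(S, Item):
        if s == c:                               # Case 2: S large item node
            return [left, right]
        return [NonItem([left, right], s + 1)]   # Case 3: S small item node
    if s < c:                                    # Case 4: both small
        out = pjoin(left, right, s)
        return out if len(out) == 1 else [NonItem(out, s + 1)]
    # Case 5: S is a large non-item node and T is small.
    inner = S.children.pop(-1 if big_first else 0)   # child of S next to T
    out = pjoin(*order(inner, T), s - 1)
    if len(out) == 1 or len(S.children) == 1:    # Cases 5a and 5b
        S.children[:] = S.children + out if big_first else out + S.children
        return [S]
    return order(S, NonItem(out, s))             # Case 5c: split ternary node


def join(left, right):
    """join = pjoin at rank s + 1, which always yields a single tree."""
    out = pjoin(left, right, max(left.rank, right.rank) + 1)
    assert len(out) == 1
    return out[0]


def is_biased_23(n):
    """Check arity, stored ranks, and the balance condition of Definition 2.4."""
    if isinstance(n, Item):
        return True
    kids, c = n.children, n.rank - 1
    if len(kids) not in (2, 3) or max(k.rank for k in kids) != c:
        return False
    for a, b in zip(kids, kids[1:]):
        for low, other in ((a, b), (b, a)):
            if low.rank < c and not (isinstance(other, Item) and other.rank == c):
                return False
    return all(is_biased_23(k) for k in kids)


def keys(n):
    """Keys in symmetric order."""
    return [n.key] if isinstance(n, Item) else [k for ch in n.children for k in keys(ch)]


def build(weights, start=0):
    """Join single items with keys start, start+1, ... from left to right."""
    tree = Item(start, weights[0])
    for offset, w in enumerate(weights[1:], 1):
        tree = join(tree, Item(start + offset, w))
    return tree
```

With all weights equal to 1 the sketch behaves like insertion at the right end of an ordinary 2-3 tree, including the node splits of Case 5c; with mixed weights the heavy items end up as major item nodes near the root. Note that the sketch modifies its input trees in place, as the algorithm in the text does.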


2.4.  Charging  arguments. 

Our  analysis  of  the  join  and  split  operations  will  have  to  take  into  account  the 
property  that  time  used  by  one  operation  might  have  to  be  charged  against  an  earlier 
operation.  For  bookkeeping  purposes,  imagine  allocating  poker  chips  to  an  operation  in 
a  quantity  proportional  to  the  time  we  expect  the  operation  to  take.  Our  algorithms  can 
then  spend  one  poker  chip  to  do  a  fixed  amount  of  processing,  usually  corresponding  to  one 
level  of  recursion.  If  they  ever  need  more  chips  than  they  were  allocated,  they  must  find 
the  extra  chips  somewhere  in  the  data  structure.  Conversely,  if  they  finish  the  operation 
before  running  out  of  chips,  they  may  leave  the  surplus  in  the  data  structure  for  future 
operations  to  use.  The  chips  are  not  part  of  the  data  structure,  but  merely  a  useful  fiction 
to  help  us  prove  the  time  bounds. 

The following definition describes the way in which chips affect the rank of a node.

Figure 2.10.

A completely cast biased 2-3 tree.

Definition  2.12.  A  biased  2-3  tree  with  rank  j  is  cast  to  rank  k  if  there  are  k  —  j  chips 
piled  on  its  root  (assuming  j  <  k). 

Intuitively,  the  extra  chips  make  the  tree  appear  to  have  rank  k,  in  that  we  can  allocate 
chips  to  operations  involving  this  tree  as  though  it  had  rank  k,  knowing  that  any  additional 
processing  we  may  have  to  do  because  the  tree  really  has  smaller  rank  can  be  paid  for  using 
the  chips  in  the  cast. 

The  main  difference  between  biased  2-3  trees  and  unbiased  2-3  trees  is  that  the  children 
of  a  node  in  a  biased  2-3  tree  might  have  different  ranks,  whereas  the  children  of  any  node 
in an unbiased 2-3 tree all have the same height. We can overlook this difference if we
place casts on the children to make them appear to have the same rank. We say a tree is
completely cast if it satisfies the following invariant.

Chip Invariant. Each child of a node with rank k is cast to rank k - 1.

Figure  2.10  shows  the  tree  of  Figure  2.1,  completely  cast.  Since  the  keys  are  the  same, 
we  have  put  the  ranks  inside  the  nodes;  the  number  above  a  node  is  now  the  number  of 
extra  chips  placed  on  that  node  as  part  of  a  cast. 
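
To see how many chips the invariant stores in a tree, the following sketch counts them. It is an illustration only: a tree is written as a nested Python list, an item node is represented by its integer rank, and the function names `rank` and `cast_chips` are our own. By Definition 2.12, a child of rank j under a parent of rank k carries (k - 1) - j chips.

```python
def rank(tree):
    """Rank of a tree given as an int (item rank) or a list of subtrees."""
    if isinstance(tree, int):
        return tree
    return 1 + max(rank(child) for child in tree)


def cast_chips(tree):
    """Total chips stored in a completely cast tree (the root has no cast)."""
    if isinstance(tree, int):
        return 0
    k = rank(tree)
    # Each child is cast to rank k - 1; then recurse into the child.
    return sum((k - 1) - rank(child) + cast_chips(child) for child in tree)
```

For example, `cast_chips([0, 3, [0, 0]])` counts 3 chips on the light item, none on the major item of rank 3, and 2 on the subtree of rank 1, whereas an unbiased tree such as `[[0, 0], [0, 0, 0]]` needs no chips at all.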

Our  problem  now  is:  Given  a  completely  cast  tree  (or  trees),  an  operation  to  perform, 
and  a  time  bound  to  meet,  can  we  use  a  number  of  poker  chips  proportional  to  the  time 
bound  to  pay  for  the  operation,  leaving  the  resulting  tree  (or  trees)  completely  cast?  A 
positive answer to this question will justify the phrase “chip invariant”.


2.5. Running time of the JOIN operation.

Following the notation of the join algorithm, let s and t be the ranks of the trees S
and T, and let c be the rank of the cast in which the pjoin is to occur. The join algorithm
runs in time proportional to s - t + 2 (amortised); that is to say, given s - t + 2 chips
the algorithm can pay for the computation it does according to the accounting scheme
described below. In general this scheme pays one chip for each level of recursion, except
for a few special cases in which a recursive call to pjoin is done “for free”. These free
calls involve only transfer of control and are not available to external users; their expense
is charged to the calling procedure.

The join algorithm spends one chip against the possibility that the top-level call to
pjoin turns out to be “free”; the other s - t + 1 chips are given to the pjoin algorithm.
The next theorem proves that the pjoin algorithm has enough chips to spend, and thus
that the join algorithm runs in time s - t + 2 as desired.

Theorem 2.13. The pjoin algorithm runs in time c - t (amortised).

Proof.  The  analysis  of  the  pjoin  operation  divides  into  cases  according  to  the  cases  taken 
by the algorithm. In each case we have available the c - t chips allotted to the operation
(which  we  say  come  from  the  cashier),  plus  perhaps  some  additional  chips  found  in  the 
tree  due  to  the  chip  invariant.  We  spend  chips  in  each  case  for  three  reasons:  to  pay  for 
recursive  calls  to  pjoin,  to  cast  resulting  trees  to  the  rank  required  to  maintain  the  chip 
invariant,  and  to  pay  for  the  work  actually  done  in  the  case  itself  (which  we  call  overhead). 

Cases  1,  2,  and  4a  are  not  charged  overhead,  and  a  special  argument  is  needed  to  justify 
this.  The  only  work  done  in  these  cases  is  transfer  of  control;  they  do  not  modify  the  data 
structure  (except  by  recursive  call).  The  overhead  in  these  cases  is  charged  to  the  caller. 
This  may  seem  wrong,  and  indeed  it  would  be  wrong  either  if  the  caller  were  an  external 
user  (rather  than  one  of  the  procedures  responsible  for  maintaining  the  data  structure)  or 
if  there  were  an  unbounded  chain  of  recursive  calls  involving  only  these  cases.  However, 
external  users  call  join  rather  than  pjoin,  and  the  maximum  length  of  any  chain  involving 
only  these  cases  is  2  (Case  4  may  call  on  Cases  1  or  2,  but  no  other  calls  among  these  cases 
are  possible).  It  seems  to  be  necessary  to  resort  to  this  piece  of  creative  accounting  in  order 
to  prove  the  theorem  in  the  strong  form  given  here. 

The following tables summarise the number of chips in each case. The theorem follows
by observing that the tables are correct and complete, and that the number of available
chips is never less than the number spent.

Case 1:

    Given                               Needed
    c - t   from cashier                0

    The given total is 0, since t = c in this case.

Case 2:

    Given                               Needed
    c - t   from cashier                c - t   to cast R_2

Case 3:

    Given                               Needed
    c - t   from cashier                s - t         to cast T
                                        c - (s + 1)   to cast R
                                        1             overhead
    -----                               -----
    c - t                               c - t

Case 4a:

    Given                               Needed
    c - t   from cashier                s - t   recursive call
                                        c - s   to cast R
    -----                               -----
    c - t                               c - t

Case 4b:

    Given                               Needed
    c - t   from cashier                s - t         recursive call
                                        c - (s + 1)   to cast R
                                        1             overhead
    -----                               -----
    c - t                               c - t

Case 5:

    Given                               Needed
    c - t           from cashier        (s - 1) - min(s_r, t)   recursive call
    (s - 1) - s_r   on S_r              c - s                   to cast R, R_1, and R_2
                                        1                       overhead
    ---------------------               ---------------
    c + (s - 1) - s_r - t               c - min(s_r, t)

We need no chips to cast R, R_1 and R_2, since s = c in Case 5. There are enough
chips because (s - 1) - s_r ≥ 0 (since S_r is a child of S), and because (s - 1) - t ≥ 0
(since S is large and T is small). So regardless of whether s_r or t is smaller, the
“given” total is as large as the “needed” total.
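
The Case 5 arithmetic can be checked mechanically. The sketch below simply evaluates the two column totals of the Case 5 table and tests the inequality over a small range of ranks; the function names `case5_balance` and `case5_always_covered` are hypothetical, and the check is a finite experiment that illustrates the argument rather than replacing it.

```python
def case5_balance(c, t, s_r):
    """Return the (given, needed) chip totals for Case 5, where s = c."""
    s = c
    given = (c - t) + ((s - 1) - s_r)                 # cashier + cast on S_r
    needed = ((s - 1) - min(s_r, t)) + (c - s) + 1    # call + casts + overhead
    return given, needed


def case5_always_covered(max_rank=8):
    """Check given >= needed whenever t <= s - 1 and s_r <= s - 1."""
    for c in range(1, max_rank + 1):
        for t in range(-max_rank, c):         # T is small
            for s_r in range(-max_rank, c):   # S_r is a child of S
                given, needed = case5_balance(c, t, s_r)
                if given < needed:
                    return False
    return True
```

For example, with c = 5, t = 2, and s_r = 4 both totals equal 3, while with c = 5, t = 3, and s_r = 1 there are 5 chips available against 4 needed, so one chip is left over.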

We have proved that the join algorithm joins two trees together in time proportional
to the difference in their ranks. Unfortunately, this is not quite good enough to achieve
logarithmic performance. It is easy to show that a tree with rank t may have total weight
about $3^t$. To join it to a tree with rank s and total weight about $2^s$ takes time proportional
to s - t, whereas we had hoped it to take time proportional to $\lg(2^s/3^t) = s - t - (\lg 3 - 1)t$.
The extra term is very annoying, but it seems to be the price we pay for discretising the weights
into integer ranks. Unbiased 2-3 trees have this same problem; they can be joined in time
proportional to their height difference, but not in time proportional to the logarithm of
their size ratio.

One mitigating fact is that the running time of join satisfies a “telescoping” property.
If we have a sequence $T_1, \ldots, T_n$ of trees with successively increasing ranks, joining them
all into one large tree T takes time $O(r(T) - r(T_1))$, since the successive rank differences
$t_i - t_{i-1}$ and $t_{i+1} - t_i$ have cancelling terms. This property is good enough for many
applications; in particular, it is used implicitly in the proof that the split algorithm works
in logarithmic time.


2.6. The SPLIT operation.

The  split  operation  takes  a  tree  S  with  rank  s  and  a  key  K,  and  returns  an  item 
node  I  storing  the  item  of  K,  as  well  as  two  trees  L  and  R  storing  the  left-  and  right-items 
of  K.  The  trees  may  be  null  if  the  corresponding  set  of  items  is  empty.  For  simplicity  we 
assume that S contains the item of K at rank k; a simple change to Case 1 of the algorithm
allows the split operation to work when K is not in S, but its running time becomes more
difficult to analyse due to the uncertainty about the length of the path to the “missing”
key.  See  Section  1.2,  where  this  problem  also  arises  with  the  insert  operation. 

The  split  operation  works  as  follows: 

Algorithm. split S at K.

Input: A tree S with rank s. A key K.

Preconditions: S contains the item of K in an item node with rank k.

Output: A tree L (or null), an item node I, and a tree R (or null).

Postconditions: L stores the left-items of K, I stores the item of K, and R stores the
right-items of K. Also r(L) ≤ s and r(R) ≤ s.


Figure 2.11.

A biased 2-3 tree about to be split.

Figure 2.12.

Case 1, S is an item node.

Figure 2.13.

Case 2a, split the left subtree of a binary node.


There  are  four  cases;  the  algorithm  uses  the  one  that  applies.  Figure  2.11  shows  a  tree 
ready  to  be  split. 


Case 1. [S is an item node.] If S is an item node, set L ← R ← ∅, I ← S, and return.
See Figure 2.12.

Case 2. [K is in the left subtree.] If S_l contains K, then recursively split S_l at K, obtaining
L, I, and R'. Now distinguish two subcases:

a) If S was binary, then pjoin R' and S_r at rank s to obtain R, and return. See
Figure 2.13.

b) If S was ternary, then set S_l ← S_m and S_m ← ∅, pjoin R' and S at rank s
to obtain R, and return. See Figure 2.14.

Case 3. [K is in the center subtree.] If S_m contains K, then recursively split S_m at K
to obtain L', I, and R'. Now pjoin S_l and L' at rank s to obtain L, pjoin R' and
S_r at rank s to obtain R, and return. See Figure 2.15.

Figure 2.14.

Case 2b, split the left subtree of a ternary node.

Figure 2.15.

Case 3, split the center subtree of a ternary node.

Case 4. [K is in the right subtree.] Case 4 is completely symmetric with Case 2. See
Figure 2.16 and Figure 2.17.


Proposition 2.14. The split algorithm is correct.

Proof.  Case  1  is  obviously  correct,  since  S  stores  the  item  of  K  by  assumption.  The  other 
cases  return  balanced  trees  storing  the  correct  sets  of  items  and  with  the  proper  ranks,  since 
the  pjoin  algorithm  is  correct.  The  only  subtle  point  is  proving  that  the  calls  to  pjoin 
return  one  tree  as  we  have  assumed,  in  other  words  that  the  pjoin  returns  in  Case  (i)  (see 
Section  2.3).  In  Cases  2a  and  3  we  pjoin  (at  rank  s)  two  trees  with  ranks  s  -  1  or  less,  so 
the  pjoin  is  called  in  Case  3  or  4  and  must  return  one  tree.  In  Case  2b  we  pjoin  (at  rank 
s)  a  binary  tree  at  rank  s  and  a  tree  at  rank  s  —  1  or  less;  the  pjoin  is  called  in  Case  5b 
and  returns  one  tree. 


Figure 2.16.

Case 4a, split the right subtree of a binary node.


Figure 2.17.

Case 4b, split the right subtree of a ternary node.


2.6.1. Running time.

As usual, the running time of the split operation is analysed using poker chips. We
assume the input tree satisfies the chip invariant, and that we are supposed to cast the
non-null output trees to rank c, where c ≥ s. The recursive calls in Cases 2, 3, and 4
should all cast their results to rank s − 1. The casts were not mentioned in the split
algorithm to emphasise that they are needed only for the running time analysis.

Theorem  2.15.  The  split  algorithm  uses  at  most  3(c  —  k)  +  1  chips. 

Proof. As for pjoin the proof is mostly by construction of tables showing the number of
available chips and the number of chips needed. However, the arguments are slightly more
complicated; after adding up the columns we will subtract a certain quantity of chips. As
long as we subtract a larger number from the "Given" column than from the "Needed"
column, we do not affect the result. In each case, this will be true because the subtrees
involved all have rank at most s − 1.

In the tables below, s_l, s_m, s_r, l', and r' denote the ranks of S_l, S_m, S_r, L', and R'.

Case 1:

    Given                                 Needed
    3(c − k) + 1        from cashier      1                   overhead

Case 2a:

    Given                                 Needed
    3(c − k) + 1        from cashier      1                   overhead
    (s − 1) − s_l       on S_l            3(s − 1 − k) + 1    recursive split
    (s − 1) − r'        on R'             s − min(r', s_r)    pjoin
    (s − 1) − s_r       on S_r            c − s               to cast R
                                          c − (s − 1)         to cast L
    −(s − 1 − s_l)                        −0
    −(s − r' + s − s_r − 1)               −(s − min(r', s_r))
    -----------------------               -------------------
    3c − 3k                               2c + s − 3k

Case 2b:

    Given                                 Needed
    3(c − k) + 1        from cashier      1                   overhead
    (s − 1) − s_l       on S_l            3(s − 1 − k) + 1    recursive split
    (s − 1) − r'        on R'             s − r'              pjoin
                                          c − s               to cast R
                                          c − (s − 1)         to cast L
    −(s − 1 − s_l)                        −0
    -----------------------               -------------------
    3c + s − 3k − r'                      2c + 2s − 3k − r'

Case 3:

    Given                                 Needed
    3(c − k) + 1        from cashier      1                   overhead
    (s − 1) − s_m       on S_m            3(s − 1 − k) + 1    recursive split
    (s − 1) − s_l       on S_l            s − min(s_l, l')    pjoin
    (s − 1) − s_r       on S_r            s − min(r', s_r)    pjoin
    (s − 1) − l'        on L'             c − s               to cast L
    (s − 1) − r'        on R'             c − s               to cast R
    −(s − 1 − s_m)                        −0
    −(s − s_l + s − l' − 1)               −(s − min(s_l, l'))
    −(s − r' + s − s_r − 1)               −(s − min(r', s_r))
    -----------------------               -------------------
    3c − 3k − 1                           2c + s − 3k − 1

Since s ≤ c, all these tables have the required property that the given column totals to
at least the needed column.
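The column totals in the Case 2a and Case 3 tables can be checked mechanically. The following Python sketch is only an arithmetic check under the reading of the tables given here; the function names and the random ranks are illustrative assumptions, not part of the thesis. It adds up each column, applies the subtractions, and confirms that the surplus of "Given" over "Needed" is exactly c − s.

```python
import random

def case_2a(c, s, k, s_l, s_r, r1):
    """Return (given, needed) for Case 2a after the subtractions."""
    given = (3 * (c - k) + 1) + (s - 1 - s_l) + (s - 1 - r1) + (s - 1 - s_r)
    needed = (1 + (3 * (s - 1 - k) + 1) + (s - min(r1, s_r))
              + (c - s) + (c - (s - 1)))
    # Subtract the paired quantities from the two columns.
    given -= (s - 1 - s_l) + ((s - r1) + (s - s_r) - 1)
    needed -= 0 + (s - min(r1, s_r))
    return given, needed

def case_3(c, s, k, s_l, s_m, s_r, l1, r1):
    """Return (given, needed) for Case 3 after the subtractions."""
    given = (3 * (c - k) + 1) + sum(s - 1 - x for x in (s_m, s_l, s_r, l1, r1))
    needed = (1 + (3 * (s - 1 - k) + 1) + (s - min(s_l, l1))
              + (s - min(r1, s_r)) + 2 * (c - s))
    given -= ((s - 1 - s_m) + ((s - s_l) + (s - l1) - 1)
              + ((s - r1) + (s - s_r) - 1))
    needed -= 0 + (s - min(s_l, l1)) + (s - min(r1, s_r))
    return given, needed

random.seed(0)
for _ in range(1000):
    s = random.randint(2, 20)
    c = s + random.randint(0, 5)            # casts go to rank c >= s
    k = random.randint(0, s - 1)
    ranks = [random.randint(0, s - 1) for _ in range(5)]
    for given, needed in (case_2a(c, s, k, *ranks[:3]),
                          case_3(c, s, k, *ranks)):
        assert given - needed == c - s      # surplus is never negative
```

Because the pjoin terms cancel against the subtracted quantities, the surplus does not depend on the subtree ranks at all, only on c − s.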
2.7. Other operations.

The  remaining  operations  on  dynamic  dictionaries  can  all  be  implemented  for  biased 
2-3  trees  in  terms  of  the  join  and  split  operations.  They  all  work  in  logarithmic  time 
(except  for  insert,  as  discussed  in  Section  1.2).  The  critical  observation  is  that  the  split 
algorithm  leaves  its  output  cast  to  the  rank  of  its  input.  Thus  we  can  delete  an  item  from 
a  tree  by  doing  a  split  at  that  item  and  using  the  chips  on  the  two  resulting  trees  to  join 
them  together  again  at  essentially  no  extra  cost.  Similarly,  the  promote  and  demote 
operations  can  be  done  by  doing  a  split  at  the  appropriate  node,  changing  its  weight,  and 
reattaching  the  node  into  the  two  trees,  using  the  chips  from  the  casts  to  pay  for  the  joins. 
This  technique  also  works  for  insert,  but  since  the  original  split  may  have  to  go  very 
deep  into  the  tree,  the  time  for  insert  is  at  worst  proportional  to  the  difference  in  ranks 
between  the  root  and  the  lightest  node  in  the  tree  (including  the  new  node).  However,  this 
method  of  doing  an  insert  satisfies  the  goals  mentioned  in  Section  1.2. 

Algorithm. delete K from S.

Step 1. [Detach the item.] split S at K to form L, I, and R.

Step 2. [Reassemble the tree.] join L and R to form S'.

Taking the casts into account, Step 1 needs 3(s − k) + 1 chips and Step 2 needs s − s + 2,
for a total of 3(s − k) + 3. By Lemma 2.10, this is at most 3 lg(W/w) + 9.

Algorithm. demote K in S by δ.

Step 1. [Detach the item.] split S at K to form L, I, and R.

Step 2. [Change the weight.] Decrease W(I) by δ (checking that W(I) > 0), and update
r(I).

Step 3. [Reattach right tree.] join I and R to form R'.

Step 4. [Reassemble the tree.] join L and R' to form S'.

Taking casts into account, the steps have respective costs of at most 3(s − k) + 1, 1,
s − t + 2, and s + 1 − s + 2, where t = ⌊lg(w − δ)⌋ is the new rank of I. Since

\[
\begin{aligned}
s - t = (s - k) + (k - t) &< (\lg(W/w) + 2) + \lfloor \lg w \rfloor - \lfloor \lg(w - \delta) \rfloor \\
&< \lg W - \lg w + \lg w - \lg(w - \delta) + 1
\end{aligned}
\]

and since lg(W/w) < lg(W/(w − δ)), the total cost is at most 4 lg(W/(w − δ)) + 14, by
Lemma 2.10.
Algorithm. promote K in S by δ.

Step 1. [Detach the item.] split S at K to form L, I, and R.

Step 2. [Change the weight.] Increase W(I) by δ and update r(I).

Step 3. [Reattach right tree.] join I and R to form R'.

Step 4. [Reassemble the tree.] join L and R' to form S'.

Taking casts into account, the steps have respective costs of at most 3(s − k) + 1, 1,
max(t − s + 2, s − t + 2), and max(t + 1 − s + 2, s + 1 − s + 2), where t = ⌊lg(w + δ)⌋ is
the new rank of I. If t ≤ s, then since t ≥ k we have s − t ≤ s − k, and the whole
algorithm needs at most 4 lg((W + δ)/w) + 15 chips, by Lemma 2.10. If t > s, then since
s ≥ k > lg w − 1 we have t − s < lg(W + δ) − (lg w − 1), and the whole algorithm needs at
most 5 lg((W + δ)/w) + 15 chips. In either case, the algorithm needs at most
5 lg((W + δ)/w) + 15 chips.

Algorithm. insert K into S.

Step 1. [Disassemble the tree.] split S at K to form L and R.

Step 2. [Reassemble right tree.] join I(K) and R to form R'.

Step 3. [Reassemble the tree.] join L and R' to form S'.

The number of chips needed depends on k, the rank at which the search for K terminates
(namely the rank of one of the items neighboring the gap where K belongs), and on t =
⌊lg W(K)⌋, the rank of the new item. The steps have respective costs at most 3(s − k) + 1,
max(s − t + 2, t − s + 2), and max(s + 1 − s + 2, t + 1 − s + 2), which means the algorithm
has the behavior described in Section 1.2.
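The four algorithms above share one shape: split at the key, adjust, and join the pieces back together. The following Python sketch illustrates only that shape. It uses a plain sorted list of (key, weight) pairs as a stand-in for a biased 2-3 tree, so it shows the control flow and not the chip-bounded running times; the function names and the list representation are assumptions made for this illustration.

```python
from bisect import bisect_left

# A "tree" here is a sorted list of (key, weight) pairs. A real biased
# 2-3 tree would replace split and join with the rank-based versions.

def split(s, key):
    """Return (left, item, right); item is None if key is absent."""
    i = bisect_left(s, (key,))
    if i < len(s) and s[i][0] == key:
        return s[:i], s[i], s[i + 1:]
    return s[:i], None, s[i:]

def join(left, right):
    """Concatenate two trees; all keys in left precede all keys in right."""
    assert not left or not right or left[-1][0] < right[0][0]
    return left + right

def delete(s, key):
    left, _item, right = split(s, key)       # Step 1: detach the item
    return join(left, right)                 # Step 2: reassemble the tree

def change_weight(s, key, delta):
    """promote (delta > 0) or demote (delta < 0) the item with this key."""
    left, item, right = split(s, key)        # Step 1: detach the item
    if item is None:
        raise KeyError(key)
    new_weight = item[1] + delta             # Step 2: change the weight
    if new_weight <= 0:
        raise ValueError("weight must stay positive")
    r1 = join([(key, new_weight)], right)    # Step 3: reattach right tree
    return join(left, r1)                    # Step 4: reassemble the tree

def insert(s, key, weight):
    left, item, right = split(s, key)        # Step 1: disassemble the tree
    if item is not None:
        raise KeyError(f"{key!r} already present")
    r1 = join([(key, weight)], right)        # Step 2: reassemble right tree
    return join(left, r1)                    # Step 3: reassemble the tree
```

For example, `insert(insert([], "b", 4), "a", 1)` yields the list `[("a", 1), ("b", 4)]`, and `change_weight` on that list with key `"b"` and delta `-3` demotes the item to weight 1.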


Chapter  3 

Biased  Weight-Balanced  Trees 


3.1. Introduction and definitions.

Just  as  we  can  obtain  an  efficient  dynamic  weighted  data  structure  from  a  2-3  tree 
by  relaxing  the  balance  constraints  near  a  heavy  item  node,  we  can  obtain  a  second 
implementation  of  a  dynamic  weighted  dictionary  by  relaxing  the  balance  constraints  in  a 
weight-balanced tree. The resulting class of trees will have logarithmic performance, even
for  the  join  operation.  Thus  it  is  possible  to  eliminate  the  discretisation  problem  present  in 
biased  2-3  trees,  but  at  the  price  of  having  to  store  and  manipulate  real  numbers  (weights) 
in  the  internal  nodes  rather  than  integers  (ranks). 

Weight-balanced trees were proposed by Nievergelt and Reingold [31, 34], who originally
(and more appropriately) called them trees of bounded balance. Generalising their
definition, which applied only to the case where the weight of a node was 1 + (the number
of nodes in its subtree), define a weight-balanced tree with balance factor α to be a binary
tree in which each node n has a weight w(n) which satisfies the following constraints:

1. [Positivity.] The weight w(n) > 0 for all nodes n.

2. [Additivity.] If n is the parent of nodes n_1 and n_2, then w(n) = w(n_1) + w(n_2).

3. [Balance.] If n is the parent of node n', then w(n') ≥ α w(n).

The additivity condition implies that the weight of any node n is simply the sum of the
weights of the leaves in the subtree rooted at n, and in particular the weight of the root
is the total weight W of all the leaves in the tree. Another way of stating the balance
condition is that the ratio ρ(n_1) = w(n_1)/w(n_2) of the weights of node n_1 and its sibling
n_2 satisfies α/(1 − α) ≤ ρ(n_1) ≤ (1 − α)/α.

We shall extend the definition of a weight-balanced tree to allow heavy item nodes.
Unlike the case of biased 2-3 trees, items will be allowed to appear in internal nodes. This
blurs the distinction between data-carrying nodes and bookkeeping nodes and makes the
algorithms somewhat more complicated; this is required simply because a binary tree is
not as flexible as a 2-3 tree.

A biased weight-balanced tree is a binary tree with two kinds of nodes, called item
nodes and non-item nodes. Each item node stores one item, and they are arranged in
the tree so that listing the item nodes in symmetric order gives a list of nodes sorted by
ascending key value. Item nodes may have zero, one, or two children. A non-item node is
simply a binary internal node; it has exactly two children.

A  leaf  must  be  an  item  node,  but  not  all  item  nodes  are  leaves.  Nodes  whose  parents 
are  item  nodes  play  a  special  role  in  the  algorithms  that  manipulate  biased  weight-balanced 
trees.  Such  nodes  are  intuitively  undesirable  because  they  make  the  tree  more  complicated, 
yet  they  are  necessary  to  maintain  proper  balance,  so  we  give  them  a  name. 

Definition  3.1.  A  subitem  node  is  a  node  whose  parent  is  an  item  node.  A  normal  node 
is  a  node  whose  parent  is  a  non-item  node. 

Definition 3.2. The following functions are defined on nodes in a biased weight-balanced
tree:

• The mass m(n). If n is an item node storing item I, then m(n) = W(I), the weight of
the item. If n is a non-item node it has no mass, that is m(n) = 0.

• The weight w(n). If n is a leaf, then w(n) = m(n). The weight of internal nodes is
derived from the additivity condition, stated below in Definition 3.3.

• The balance β(n). Let p be the parent of n. If n is a normal node then β(n) =
w(n)/w(p). Otherwise if n is a subitem node then β(n) = w(n)/m(p).

• The ratio ρ(n). If n' is the sibling of n, then ρ(n) = w(n)/w(n').

• The degree d(n). If n is an item node, then d(n) is the number of children it has. (We
could also define d(n) = 2 for non-item nodes.)

When no confusion can occur, we will drop the parentheses from the balance and ratio
functions, writing βn and ρn for β(n) and ρ(n).

In the algorithms and proofs of this chapter, we will usually use the name of a node n
to stand for its weight w(n) in contexts where a weight is expected (we will always indicate
the mass of a node explicitly). If n is an item node, then "n" will mean w(n), which is
larger than m(n) if n has children. In contexts where a tree is expected, "n" will mean the
subtree rooted at n.

Figure 3.1.

A biased weight-balanced tree.

Figure  3.1  shows  a  biased  weight-balanced  tree.  Squares  represent  item  nodes  and 
circles  represent  non-item  nodes.  The  number  inside  an  item  node  is  its  mass;  the  number 
over  a  node  is  its  balance.  Non-item  node  a  has  weight  25.  Non-item  node  b  has  weight  5 
and  ratio  5/20  =  1/4.  Its  sibling  is  an  item  node  with  weight  20  and  mass  11. 

Definition 3.3. A biased weight-balanced tree is in the class BWB[α, α'] if it satisfies the
following three conditions:

1. [Positivity.] The weight w(n) > 0 for all nodes n.

2. [Additivity.] If n is the parent of n_1 and n_2, then w(n) = m(n) + w(n_1) + w(n_2).

3. [Balance.] If n is a normal node then β(n) ≥ α. Otherwise, if n is a subitem node
then β(n) ≤ α'.

Here α and α' are real numbers; we will determine feasible and desirable values for α and
α' in Section 3.9. The positivity condition is included for similarity with the definition of
weight-balanced trees; it can be derived from the definitions of weight and mass and from
the additivity condition. A node satisfying the balance condition is called balanced. Thus
the tree of Figure 3.1 is in the class BWB[1/5, 5/11].
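Definition 3.3 can be turned into a direct membership test. The Python sketch below is a minimal illustration under stated assumptions: the `Node` class, the function `in_bwb`, and the example tree are hypothetical. The example reuses the weights quoted for Figure 3.1 (a root of weight 25, a non-item child of weight 5, and an item child of mass 11 and weight 20), but the leaf masses are invented, since the figure itself is not reproduced here.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class Node:
    mass: float = 0.0                  # m(n): item weight, 0 for a non-item node
    left: Optional["Node"] = None
    right: Optional["Node"] = None

    @property
    def is_item(self) -> bool:
        return self.mass > 0

    @property
    def weight(self) -> float:
        # Additivity: w(n) = m(n) + w(n_1) + w(n_2).
        return self.mass + sum(c.weight for c in (self.left, self.right) if c)

def in_bwb(n: Node, alpha: float, alpha_p: float) -> bool:
    """Check the balance condition of Definition 3.3 on n and its descendants."""
    if not n.is_item and (n.left is None or n.right is None):
        return False                   # a non-item node has exactly two children
    for child in (n.left, n.right):
        if child is None:
            continue
        if n.is_item:                  # child is a subitem node
            ok = child.weight / n.mass <= alpha_p
        else:                          # child is a normal node
            ok = child.weight / n.weight >= alpha
        if not ok or not in_bwb(child, alpha, alpha_p):
            return False
    return True

# Hypothetical tree: non-item root over a non-item node of weight 5 and an
# item node of mass 11 whose two subitem children have masses 5 and 4.
b = Node(left=Node(mass=2), right=Node(mass=3))
heavy = Node(mass=11, left=Node(mass=5), right=Node(mass=4))
root = Node(left=b, right=heavy)

print(root.weight, in_bwb(root, 1 / 5, 5 / 11))
```

Tightening either parameter rejects this tree: with α = 1/4 the normal node of weight 5 is too light under the root, and with α' = 2/5 the subitem of weight 5 is too heavy under the item of mass 11.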

By the additivity condition, the weight of any node n is simply the sum of the weights
of all the items stored in the subtree rooted at n. In particular, the weight of the root is the
total weight W of all the items stored in the tree. We allow item nodes to have children,
but only if these children are not too heavy compared with the item; this is the import of
the second part of the balance condition.

Several consequences of the balance and additivity conditions deserve to be mentioned.
The next lemma gives us four ways to prove a normal node n_1 is balanced.

Lemma 3.4. If n is a non-item node with children n_1 and n_2, then the following are
equivalent:

(a) β(n_1) ≥ α,

(b) β(n_2) ≤ 1 − α,

(c) ρ(n_1) ≥ α/(1 − α),

(d) ρ(n_2) ≤ (1 − α)/α.

Proof.

(a) ⇒ (b): If n_1/n ≥ α, then n_2 = n − n_1 ≤ n − αn, so n_2/n ≤ 1 − α.

(b) ⇒ (c): If n_2/n ≤ 1 − α, then n_1/n_2 = (n − n_2)/n_2 ≥ 1/(1 − α) − 1 = α/(1 − α).

(c) ⇒ (d): This is immediate because ρ(n_2) = 1/ρ(n_1).

(d) ⇒ (a): If n_2/n_1 ≤ (1 − α)/α, then n_1/n = ((n_1 + n_2)/n_1)^(−1) ≥ (1 + (1 − α)/α)^(−1) = α.
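The equivalence in Lemma 3.4 is easy to confirm numerically. The short Python sketch below is an illustrative check, not part of the thesis; it uses exact rational arithmetic so that boundary cases are not blurred by rounding, and the function name `conditions` is an assumption.

```python
from fractions import Fraction
from itertools import product

def conditions(n1, n2, alpha):
    """Evaluate conditions (a) to (d) of Lemma 3.4 for sibling weights n1, n2."""
    n = n1 + n2                               # additivity at a non-item parent
    return (
        n1 / n >= alpha,                      # (a) beta(n_1) >= alpha
        n2 / n <= 1 - alpha,                  # (b) beta(n_2) <= 1 - alpha
        n1 / n2 >= alpha / (1 - alpha),       # (c) rho(n_1) >= alpha/(1 - alpha)
        n2 / n1 <= (1 - alpha) / alpha,       # (d) rho(n_2) <= (1 - alpha)/alpha
    )

alpha = Fraction(1, 5)
for n1, n2 in product(range(1, 30), repeat=2):
    results = conditions(Fraction(n1), Fraction(n2), alpha)
    # Equivalent conditions are either all true or all false.
    assert all(results) or not any(results)
```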

The following lemma relates the weight of an item node to its mass and to the weights
of its children.

Lemma 3.5. If n is an item node with children (subitem nodes) n_1 and n_2, which may
be null, then

\[
\begin{aligned}
\text{(a)}\quad & n_1 \le \frac{\alpha'}{1 + \alpha'}\, n, \\
\text{(b)}\quad & n_1 + n_2 \le \frac{2\alpha'}{1 + 2\alpha'}\, n, \\
\text{(c)}\quad & m(n) \ge \frac{1}{1 + d(n)\alpha'}\, n, \\
\text{(d)}\quad & m(n) \ge \frac{1}{1 + \alpha'}\, (n - n_1).
\end{aligned}
\]

Proof. Since each child of n has weight at most α'm(n), Case (a) is true because

\[ n_1 \le \alpha' m(n) = \alpha'(n - n_1 - n_2) \le \alpha'(n - n_1). \]

Case (b) is true because

\[ n_1 + n_2 \le 2\alpha' m(n) = 2\alpha'\bigl(n - (n_1 + n_2)\bigr). \]

Case (c) is true because

\[ m(n) \ge n - d(n)\alpha' m(n). \]

Similarly, Case (d) is true because

\[ m(n) = n - n_1 - n_2 \ge n - n_1 - \alpha' m(n). \]
3.2.  More  about  entropy. 

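Given a list of weights w = (w_1, ..., w_k) with total weight W = Σ w_i, recall
that we defined the entropy of the list by the formula

\[ H(w_1, \ldots, w_k) = \sum_{1 \le i \le k} \frac{w_i}{W} \lg \frac{W}{w_i}. \tag{3.1} \]

The following lemma describes what happens to the entropy when we concatenate two lists
of weights.

Lemma 3.6. Suppose two lists of weights w1 = (w_1, ..., w_m) and w2 = (w_{m+1}, ..., w_n)
are given. Let w = (w_1, ..., w_m, w_{m+1}, ..., w_n) be the concatenation of the two lists, and
let

\[ W_1 = \sum_{1 \le i \le m} w_i, \qquad W_2 = \sum_{m < i \le n} w_i, \qquad W = \sum_{1 \le i \le n} w_i \]

be the total weights of the various lists. Then

\[ W H(\mathbf{w}) = W_1 H(\mathbf{w}_1) + W_2 H(\mathbf{w}_2) + W H(W_1, W_2). \]

Proof. The proof is a straightforward calculation:

\[
\begin{aligned}
W_1 H(\mathbf{w}_1) + W_2 H(\mathbf{w}_2)
&= -\Bigl( W_1 \sum_{1 \le i \le m} \frac{w_i}{W_1} \lg \frac{w_i}{W_1}
        + W_2 \sum_{m < i \le n} \frac{w_i}{W_2} \lg \frac{w_i}{W_2} \Bigr) \\
&= -W \Bigl( \sum_{1 \le i \le n} \frac{w_i}{W} \lg \frac{w_i}{W}
        + \frac{W_1}{W} \lg \frac{W}{W_1} + \frac{W_2}{W} \lg \frac{W}{W_2} \Bigr) \\
&= W H(\mathbf{w}) - W H(W_1, W_2).
\end{aligned}
\]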
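The identity of Lemma 3.6 can be spot-checked numerically. The Python sketch below is illustrative only; the function name `entropy` and the sample weights are assumptions. It evaluates both sides of the identity for one pair of lists using formula (3.1).

```python
from math import isclose, log2

def entropy(weights):
    """H(w_1, ..., w_k) = sum of (w_i / W) * lg(W / w_i), as in (3.1)."""
    total = sum(weights)
    return sum(w / total * log2(total / w) for w in weights)

w1 = [1.0, 2.0, 5.0]          # first list, total W_1
w2 = [4.0, 4.0]               # second list, total W_2
W1, W2 = sum(w1), sum(w2)
W = W1 + W2

lhs = W * entropy(w1 + w2)    # W H(w) for the concatenated list
rhs = W1 * entropy(w1) + W2 * entropy(w2) + W * entropy([W1, W2])
assert isclose(lhs, rhs)
```

The last term, W H(W_1, W_2), is the extra entropy paid for choosing between the two lists before choosing an item within one of them.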
As a notational convenience, define

\[ H_\alpha = H(\alpha, 1 - \alpha) \]

for real numbers α such that 0 ≤ α ≤ 1. As a function of α in this range, H_α is symmetric
around α = 1/2 and convex, since

\[ \frac{d^2 H_\alpha}{d\alpha^2} = \frac{-1}{\alpha(1 - \alpha)\ln 2} < 0, \]

so

\[ 0 = H_0 = H_1 \le H_\alpha \le H_{1/2} = 1, \quad \text{if } 0 \le \alpha \le 1. \]

Next  we  bound  the  entropy  of  balanced  siblings  in  a  biased  weight-balanced  tree. 

Lemma 3.7. Let n_1 and n_2 be siblings in a biased weight-balanced tree. If n_1 and n_2 are
both balanced, then H(n_1, n_2) ≥ H_α.

Proof. By definition,

\[
\begin{aligned}
H(n_1, n_2) &= \frac{n_1}{n} \lg \frac{n}{n_1} + \frac{n_2}{n} \lg \frac{n}{n_2} \\
            &= -\bigl( x \lg x + (1 - x) \lg(1 - x) \bigr),
\end{aligned}
\]

where x = n_1/n. The lemma follows because α ≤ x ≤ 1 − α, and because H is convex
and symmetric around x = 1/2.


3.3.  Path  length. 

Our goal in defining biased weight-balanced trees is to have the length of the path in a
tree of total weight W from the root to an item node of mass w proportional to log(W/w).
The following proposition shows we achieve this goal.


Proposition 3.8. Let T be a tree in the class BWB[α, α'], with total weight W. The
length of the path from the root to an item node with mass w is at most ⌈log_κ(W/w)⌉,
where

\[ \kappa = \min\Bigl( \frac{1}{1 - \alpha},\ \frac{1 + \alpha'}{\alpha'} \Bigr) > 1. \]

Proof. At each step along the path from n to the root, the weight goes up either by a factor
of 1/(1 − α) at normal nodes, by Lemma 3.4(b), or by a factor of (1 + α')/α' at subitem
nodes, by Lemma 3.5(a). Since we start with weight ≥ w and end with weight W, the path
can have length at most ⌈log_κ(W/w)⌉.
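To make the bound of Proposition 3.8 concrete, the following Python sketch evaluates κ and the resulting path-length bound. It is a minimal illustration; the function names are assumptions, and the sample values α = 1/5, α' = 5/11, W = 25, w = 11 are borrowed from the class and weights quoted for Figure 3.1.

```python
from math import ceil, log

def kappa(alpha, alpha_p):
    """Smallest factor by which the weight grows in one step toward the root."""
    return min(1 / (1 - alpha), (1 + alpha_p) / alpha_p)

def max_path_length(W, w, alpha, alpha_p):
    """Upper bound of Proposition 3.8: the ceiling of log base kappa of W/w."""
    return ceil(log(W / w, kappa(alpha, alpha_p)))

# With alpha = 1/5 the normal-node factor 1/(1 - alpha) = 1.25 is the smaller
# one, so it determines kappa.
print(kappa(1 / 5, 5 / 11))
print(max_path_length(25, 11, 1 / 5, 5 / 11))
```

For these values the subitem factor (1 + α')/α' is 3.2, so the bound is governed entirely by the normal-node factor, and an item of mass 11 in a tree of total weight 25 lies at depth at most 4.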

However we can say even more about the weighted average, taken over all items, of
the path lengths.

Theorem 3.9. Let T be a tree in the class BWB[α, α']. Let w = (w_1, ..., w_k) be the list
of item weights in T, and let W = Σ w_i be the total weight. Then the total path length
L = L(T) satisfies

\[ L \le \frac{1}{H_\alpha} W H(\mathbf{w}) + 2\alpha' W. \tag{3.2} \]

Proof. The proof is by induction on the structure of the tree. The base case is that of
a single item node. In this case the path length L = 0, whereas the right side of (3.2) is
always positive.

Now given a non-trivial tree, let T_1 and T_2 be its left and right subtrees, with weight
lists w1 and w2, total weights W_1 and W_2, and total path lengths L_1 and L_2. There are
two cases, depending on whether the root is an item node or a non-item node.

If the root is a non-item node, then W = W_1 + W_2. Applying the theorem inductively,
we find that

\[
L_1 \le \frac{1}{H_\alpha} W_1 H(\mathbf{w}_1) + 2\alpha' W_1
\quad\text{and}\quad
L_2 \le \frac{1}{H_\alpha} W_2 H(\mathbf{w}_2) + 2\alpha' W_2.
\]

The path to any item in T can be broken down into a path within the appropriate subtree,
preceded by one step from the root of T to the root of the subtree; thus the total path
length of T is simply the sum of the total path lengths of the two subtrees, plus the weighted
sum of 1 for each item. That is

\[
\begin{aligned}
L &= L_1 + L_2 + \sum_i 1 \cdot w_i \\
  &= L_1 + L_2 + W \\
  &\le \frac{1}{H_\alpha} \bigl( W_1 H(\mathbf{w}_1) + W_2 H(\mathbf{w}_2) \bigr) + (1 + 2\alpha') W.
\end{aligned}
\]

Applying the entropy lemma (Lemma 3.6), we find that

\[ L \le \frac{1}{H_\alpha} W H(\mathbf{w}) + \Bigl( 1 + 2\alpha' - \frac{1}{H_\alpha} H(W_1, W_2) \Bigr) W. \]

The tree is balanced, so H(W_1, W_2) ≥ H_α by Lemma 3.7, and we conclude that

\[ L \le \frac{1}{H_\alpha} W H(\mathbf{w}) + 2\alpha' W. \]

If the root r is an item node, then the total weight W = W_1 + m(r) + W_2, although
either subtree might be empty. As in the case of non-item nodes, the total path length is
the sum of the path lengths in the two subtrees, plus 1 for each (weighted) item except the
item at the root, that is

\[ L = L_1 + L_2 + W_1 + W_2. \]

If we let w' be the list of weights of all items in T except the root item and apply the
theorem inductively, we find that

\[
\begin{aligned}
L &\le \frac{1}{H_\alpha} \bigl( W_1 H(\mathbf{w}_1) + W_2 H(\mathbf{w}_2) \bigr) + (1 + 2\alpha')(W_1 + W_2) \\
  &= \frac{1}{H_\alpha} \bigl( (W_1 + W_2) H(\mathbf{w}') - (W_1 + W_2) H(W_1, W_2) \bigr) + (1 + 2\alpha')(W_1 + W_2) \\
  &= \frac{1}{H_\alpha} \bigl( W H(\mathbf{w}) - W H(W_1 + W_2, m(r)) - m(r) H(m(r)) \\
  &\qquad\qquad - (W_1 + W_2) H(W_1, W_2) \bigr) + (1 + 2\alpha')(W_1 + W_2) \\
  &\le \frac{1}{H_\alpha} W H(\mathbf{w}) + 2\alpha' W,
\end{aligned}
\]

by Lemma 3.6 (twice), the positivity of H, and Lemma 3.5(b).

To compare this theorem to Proposition 3.8, suppose all the paths were as long as the
bound given there. Then the total path length would be Σ w_i log_κ(W/w_i) = W H(w)/lg κ.
Thus the average path is much shorter than the worst case, since

\[ \lg \kappa \le -\lg(1 - \alpha) < H_\alpha, \]

for α < 1/2.
Figure 3.2.

Two biased weight-balanced trees about to be joined.


3.4.  The  JOIN  operation. 

As  in  the  case  of  biased  2-3  trees,  all  the  operations  can  be  defined  in  terms  of  the 
join  and  split  operations. 

The algorithm to join two biased weight-balanced trees is similar in spirit to the
algorithm to insert a new node in a weight-balanced tree. The basic idea is to slide the
smaller tree down one side of the larger until we find the "level" where the smaller tree
belongs, according to its weight. The addition of more weight within the larger tree may
require us to rotate nodes along the path of insertion, or to add new non-item nodes.

The new feature of this algorithm is its accommodation of heavy item nodes. We have
allowed light nodes to exist near a heavy item node as its children, unlike weight-balanced
trees in which nearby nodes have roughly equal weights. But if we add enough weight to
one of these children, it should move out from under the heavy item to participate in the
tree on its own merit.

Here follow the details of the algorithm to join two biased weight-balanced trees. We
assume that the lighter tree is to be joined to the right of the heavier tree, as the opposite
case is completely symmetric. In Figure 3.2 we have labeled the topmost nodes of the
heavier tree a, b, c, d, e, f, g, and the root of the lighter tree z. Recall that we use the name
of a node to refer to its weight as well; for example we express the condition that the left
tree is heavier by saying b ≥ z, instead of w(b) ≥ w(z). Only the ratios of the weights
matter, so by rescaling we will assume that b = 1, and therefore that z ≤ 1.

Algorithm. join two biased weight-balanced trees.

Input: Two trees S and T, with weights b and z respectively.

Preconditions: S precedes T, b ≥ z.

Output: One tree R.

Postconditions: R stores the items of S and T in key order.

There are seven cases, according to the weights of the nodes and the configuration of the
heavier tree. The algorithm simply uses the first case that applies. We assume that α and
α' satisfy certain inequalities that will be discussed later.

Case 1. [z is heavy.] If z ≥ α/(1 − α), create a new node r as the root of the new tree,
and attach b and z as its left and right children. See Figure 3.3.

Case 2. [b is an item node.] (At this point, z < α/(1 − α).) If b is an item node, first
detach f from b and recursively join f and z to form z'. (Of course, this step is
omitted if node f is null.) Then distinguish two cases:

a) If z' ≥ α(1 − f)/(1 − α), then attach b and z' as children of a new root r. See
Figure 3.4.

b) Otherwise attach z' as the right child of b and let r = b. See Figure 3.5.

Case 3. [a is heavy.] (At this point, z < α/(1 − α), and a is a normal node.) If a ≥
α/(1 − α), recursively join f and z to form z', and attach z' as the new right
child of b. See Figure 3.6.

Case 4. [f is an item node.] (At this point, z < α/(1 − α), a is normal, and a < α/(1 − α).)
If f is an item node, first detach d from f and recursively join a and d to form a'.
(If d is null then a' = a.) Then distinguish two cases:

a) If a' ≥ α/(1 − α), then attach a' as the left child of the root b and go to Case
3, which will apply directly. See Figure 3.7.

b) Otherwise if a' < α/(1 − α), attach a' as the left child of f (which now
becomes the root), discard the old root b, and go to Case 2, which will apply
directly. See Figure 3.8.

Case 5. [g is heavy.] (At this point, z < α/(1 − α), a is normal, a < α/(1 − α), and g is
normal.) If g ≥ α, do a "single rotation", that is attach a and d as the children
of b, and attach b and g as the children of the new root f. Then go to Case 3,
which will apply directly. See Figure 3.9.

Case 6. [d is an item node.] (At this point, z < α/(1 − α), a is normal, a < α/(1 − α),
g is normal, and g < α.) If d is an item node, first detach c and e from d and
recursively join a and c to form a', and recursively join e and g to form g'. Then
attach d and g' as the children of f, and attach a' and f as the children of b. See
Figure 3.10. Now distinguish several cases:

a) If a' ≥ α/(1 − α) then go to Case 3, which will apply directly.

b) If a' < α/(1 − α) and g' ≥ α, then go to Case 5, which will apply directly.

c) Otherwise if a' < α/(1 − α) and g' < α, then reattach g' as the right child
of d, discard f, attach d as the right child of the root b, and go to Case 4,
which will apply directly. See Figure 3.11.

Case 7. [All other cases.] (At this point, z < α/(1 − α), a is normal, a < α/(1 − α), g is
normal, g < α, and d is a non-item node.) If none of the preceding cases apply,
then do a "double rotation", that is, attach a and c as children of b, attach e and
g as children of f, and attach b and f as children of the new root d. Then go to
Case 3, which will apply directly. See Figure 3.12.


Figure 3.3.

Case 1, z is heavy.

Figure 3.4.

Case 2a, b is an item node and z' is heavy.

Figure 3.5.

Case 2b, b is an item node and z' is light.

Figure 3.6.

Case 3, a is heavy.

Figure 3.8.

Case 4b, a' does not balance item node f.

Figure 3.9.

Case 5, g is heavy (single rotation).

Figure 3.11.

Case 6c, d is a heavy item node.

Figure 3.12.

Case 7, double rotation.
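The rule that the algorithm uses the first case that applies can be written as a simple chain of tests. The Python sketch below covers only this case selection and performs no restructuring; the function name and its boolean parameters are assumptions made for illustration. Weights are taken as already rescaled so that b = 1.

```python
def first_join_case(z, a, g, alpha, b_is_item=False, f_is_item=False,
                    d_is_item=False):
    """Return the label of the first case of the join algorithm that applies.

    z is the weight of the lighter tree; a and g are the weights of the
    nodes labeled a and g in the heavier tree, whose root b has weight 1.
    """
    heavy = alpha / (1 - alpha)
    if z >= heavy:
        return "Case 1"      # z is heavy: new root over b and z
    if b_is_item:
        return "Case 2"      # b is an item node: join f and z first
    if a >= heavy:
        return "Case 3"      # a is heavy: push z down the right side
    if f_is_item:
        return "Case 4"      # f is an item node: merge d into a, then retry
    if g >= alpha:
        return "Case 5"      # g is heavy: single rotation
    if d_is_item:
        return "Case 6"      # d is an item node: redistribute its children
    return "Case 7"          # otherwise: double rotation

print(first_join_case(z=0.1, a=0.2, g=0.3, alpha=0.2))
```

Each test is reached only when all earlier tests have failed, which is exactly what the parenthetical "At this point" remarks in the cases record.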


3.5.  Correctness  of  the  JOIN  algorithm. 

In proving the correctness of the join algorithm, we assume several relations between
α and α'. Afterwards we will collect these relations and determine feasible and desirable
values for these parameters.

In proving that the transformed trees are balanced, two simple arguments occur
frequently. If we add mass to the subtree rooted at some normal node n, or if we remove
mass from the subtree rooted at its sibling, the ratio ρ(n) increases, so node n remains
balanced. In this event we will say that node n balances incrementally. Or it may happen
that after a transformation the subtree of n has less mass than the subtree of some different
node n_1, while the subtree of n' has more mass than the subtree of n_1's sibling. In this event,
ρ(n_1) ≥ ρ(n), so n_1 must be balanced; we will say that node n_1 balances incrementally over
n, to emphasise which nodes are being compared. It is possible that n will be a descendant
of n_1 after the transformation.

Theorem  3.10.  The  join  algorithm  is  correct. 

Proof. The new tree R is easily seen to store the items of S and T in the correct order. It
suffices to prove that R is balanced. Lemma 3.4, which gave us four ways to prove a node
balanced, is the main tool. We use it tacitly throughout the proof.

Case 1. (See Figure 3.3.) Node b is balanced because

\[ \beta b = \frac{1}{1 + z} \ge \frac{1}{2} \ge \alpha, \]

which is true assuming

\[ \alpha \le \frac{1}{2}. \tag{3.3} \]

Node z is balanced because ρz = z ≥ α/(1 − α).

Case 2. The left tree is still balanced after removing node f. Suppose z' ≥ α(1 − f)/(1 − α),
so Case 2a applies (Figure 3.4). Then z' is balanced because ρz' = z'/(1 − f) ≥ α/(1 − α).
Also, by Lemma 3.5(a), b is balanced because

\[
\beta b = \frac{1 - f}{1 + z}
\ge \frac{1 - \alpha'/(1 + \alpha')}{1 + \alpha/(1 - \alpha)}
= \frac{1 - \alpha}{1 + \alpha'}
\ge \alpha,
\]

which is true assuming

\[ \frac{1}{1 + \alpha'} \ge \frac{\alpha}{1 - \alpha}. \tag{3.4} \]

Otherwise in Case 2b (Figure 3.5), z' balances under b because by Lemma 3.5(c) we have

\[
\frac{z'}{m(b)} < \frac{\alpha(1 - f)/(1 - \alpha)}{(1 - f)/(1 + \alpha')}
= \frac{\alpha(1 + \alpha')}{1 - \alpha} \le \alpha',
\]

which is true assuming

\[ \frac{\alpha'}{1 + \alpha'} \ge \frac{\alpha}{1 - \alpha}. \tag{3.5} \]

Case 3. (See Figure 3.6.) Node z' balances incrementally over f. Node a balances because

\[ \frac{a}{1 + z} \ge \frac{\alpha/(1 - \alpha)}{1 + \alpha/(1 - \alpha)} = \alpha. \]
Case 4. Before the forming of a', we knew by Lemma 3.5(a) that

\[ d \le \frac{\alpha'}{1 + \alpha'}\, f = \frac{\alpha'}{1 + \alpha'}\,(1 - a). \]

In Case 4a (Figure 3.7), a' balances incrementally and f balances because

\[
\beta f = f = 1 - (a + d) \ge 1 - a - \frac{\alpha'}{1 + \alpha'}(1 - a)
= \frac{1 - a}{1 + \alpha'} \ge \frac{(1 - 2\alpha)/(1 - \alpha)}{1 + \alpha'} \ge \alpha,
\]

which is true assuming

\[ \frac{1 - 2\alpha}{\alpha(1 - \alpha)} \ge 1 + \alpha'. \tag{3.6} \]

Also, Case 3 will apply by construction since a' ≥ α/(1 − α).

In Case 4b (Figure 3.8), a' balances under f because by Lemma 3.5(d),

\[
\beta a' = \frac{a'}{m(f)} \le \frac{a'}{(1 - a')/(1 + \alpha')}
< \frac{\alpha(1 + \alpha')}{1 - 2\alpha} \le \alpha',
\]

which is true assuming

\[ \frac{\alpha'}{1 + \alpha'} \ge \frac{\alpha}{1 - 2\alpha}. \tag{3.7} \]

Also, the new root f is an item node, so Case 2 will apply.

Case 5. (See Figure 3.9.) Node g is balanced by construction, since βg = g ≥ α. Node b is
balanced incrementally over a. Node a is also balanced incrementally. Node d is balanced
because in the original tree, before the transformation, d ≥ αf = α(1 − a), so after the
rotation,

\[ \rho d = \frac{d}{a} \ge \frac{\alpha(1 - a)}{a} > 1 - 2\alpha \ge \frac{\alpha}{1 - \alpha}, \]

which is true assuming

\[ 1 - 2\alpha \ge \frac{\alpha}{1 - \alpha}. \tag{3.8} \]

Also, Case 3 will apply because, in the original tree, g ≤ (1 − α)f ≤ (1 − α)², so that after
the rotation

\[ b \ge 1 - (1 - \alpha)^2 \ge \frac{\alpha}{1 - \alpha}, \]

which is true assuming

\[ 2\alpha - \alpha^2 \ge \frac{\alpha}{1 - \alpha}. \tag{3.9} \]
Case 6. First we must check that the original transformation preserves balance (see Figure
3.10).  Both  a'  and  balance  incrementally.  Before  the  transformation  we  have 


*  a  1  —  3a  4-  a2 


and  thus  by  Lemma  3.5(c), 


m(d)  > 


1  -  3a  +  a2 
(l-a)(l  +  2a')‘ 


(3.10) 


After  the  transformation  d  is  balanced  because 


e  +  g  pe  +  g/m[d) 


a'  4-  a(l  -  a)(l  4-  2a')/(l  -  3a  4-  a2) 

_  1  —  3a  4-  a2 

~~  a(l  -  a)  4-  a'(l  -  a  -  a2) 

> 

“  1  -a* 


which  is  true  assuming 


1  —  3a  4-  a2  ^  a 

a(l  —  a)  4-  a'(l  —  a  —  a2)  “  1  —  a 


(3.11) 


Furthermore,  since  c  was  balanced  before  the  transformation,  by  Lemma  3.5(a)  we  have 


C  -  TTa’d  ~  ITa'*1  “  “)C  _  °)’ 


so  /  is  balanced  after  the  transformation  because 


wl4-aa\ 

,1  -  2aw  1  4-  aa'. 




which  is  true  assuming 


1  +  aa'  .  a(l  —  a) 
1  +  a?  ~  1  —  2a 


(3.12) 


All we have to check in Cases 6a and 6b is that Cases 3 and 5, respectively, apply. But
this is true by construction. Finally in Case 6c (Figure 3.11), the right child of the root
is an item node, so Case 4 will apply; we need only show the new tree is balanced. But
after this transformation a' is unchanged, so nodes a' and d still balance. Node g' balances
because

/ «  -  ,  at  1  —  3a  +  a8 

—  - - , 


RfJ  _  jf_  <*(1  ~  °»)  . 

m(d)  1  —  3a  +  a8  ~  ’ 


which  is  true  assuming 


a'  >  “t1-*) 

1  —  3a  +  a8 


(3.13) 


Case  7.  (See  Figure  3.12.)  Nodes  a,  b,  and  g  balance  incrementally  over  a,  a,  and  g 
respectively.  Before  the  double  rotation,  we  had 

\[ d = 1-(a+g) \ge 1-\Bigl(\frac{\alpha}{1-\alpha}+\alpha\Bigr) = \frac{1-3\alpha+\alpha^2}{1-\alpha}. \]

Node  c  balances  afterwards  because 

c  ^  ad  v  ,  „  o  .  a 

pc  =  -  >  —jr- - r  >  1  -  3a  +  a8  >  - - , 

a  c»/(l  —  a)  1  —  a 


which  is  true  assuming 


Node  e  balances  because 


1  —  3a  +  a8  >  . 

—  1  —  a 


e  ad  ^  1  -  3a  +  a8  a 

<***>>T-  i-,  >—«• 


(3.14) 


which  is  true  assuming 


1  -  3a  +  a8  >  a. 


(3.151 
Node f balances because, before the rotation,

\[ c \le (1-\alpha)^2 f = (1-\alpha)^2(1-a); \]

so afterwards

\[ \beta_f = f = 1-(a+c) \ge 1-\bigl(a+(1-\alpha)^2(1-a)\bigr) = (1-a)(2\alpha-\alpha^2) \ge \frac{1-2\alpha}{1-\alpha}\,\alpha(2-\alpha) \ge \alpha, \]

which is true assuming

\[ (1-2\alpha)(2-\alpha) \ge 1-\alpha. \tag{3.16} \]


Finally, Case 3 will now apply because

\[ b = a + c \ge a + \alpha d \ge \alpha\Bigl(1+\frac{1-3\alpha+\alpha^2}{1-\alpha}\Bigr) = \frac{\alpha}{1-\alpha}(2-4\alpha+\alpha^2) \ge \frac{\alpha}{1-\alpha}, \]

which is true assuming

\[ 2-4\alpha+\alpha^2 \ge 1. \tag{3.17} \]


3.6.  Running  time  of  the  JOIN  algorithm. 

As  in  the  case  of  biased  2-3  trees,  the  running  time  of  the  join  algorithm  for  biased 
weight-balanced  trees  will  be  analysed  by  using  poker  chips  to  account  for  the  cost  of 
elementary  operations.  We  will  also  leave  chips  in  the  tree  to  be  used  by  later  operations, 
allowing  us  to  amortise  the  cost  over  a  sequence  of  operations. 

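Let

\[ \tau(x) = \begin{cases} 1, & \text{if } 1 \le x \le \dfrac{1-\alpha}{\alpha}; \\[1ex] 3\Bigl\lceil \log_\rho \dfrac{\alpha x}{1-\alpha} \Bigr\rceil + 1, & \text{if } x > \dfrac{1-\alpha}{\alpha}. \end{cases} \tag{3.18} \]

This is the running-time function, defined for $x \ge 1$; in other words, $\tau(x)$ is the number of
chips the cashier allots to a join of two trees S and T whose weight ratio $W(S)/W(T)$ is
$x$, assuming $W(S) \ge W(T)$. Clearly $\tau(x) = O(\log x)$; in fact

\[ \tau(x) = \frac{3}{\lg\rho}\lg x + O(1), \tag{3.19} \]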


so we have solved the discretisation problem that was the one technical flaw in biased
2-3 trees (see Section 2.5). Biased weight-balanced trees achieve “true” logarithmic
performance.

The constant $\rho$ is some number greater than 1. We obtain the best bound on the
running time by choosing $\rho$ as large as possible; however, the choice of $\rho$ is constrained by
the choices of $\alpha$ and $\alpha'$. These choices will be discussed in Section 3.9.

We use the notation

\[ \tau(x,y) = \max\bigl(\tau(x/y), \tau(y/x)\bigr) \]

to denote the running time of a join of two trees of weights $x$ and $y$, when we do not know
a priori which tree is heavier; we also let $\tau(x,0) = \tau(0,x) = 0$, since join costs nothing
when one operand is empty.
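The chip allotment can be made concrete with a short Python sketch. It assumes the reading of the running-time function in which $\tau(x) = 1$ up to the ratio $(1-\alpha)/\alpha$ and grows by three chips for each further factor of $\rho$; the function names `tau` and `tau_join` and the sample values $\alpha = 0.2$, $\rho = 1.1$ are illustrative choices, not part of the thesis. The two-argument form evaluates $\tau$ at the heavier-to-lighter ratio.

```python
import math


def tau(x: float, alpha: float, rho: float) -> int:
    """Chips allotted to a join whose weight ratio is x >= 1 (assumed form of tau)."""
    if x < 1:
        raise ValueError("tau is defined only for x >= 1")
    if x <= (1 - alpha) / alpha:
        return 1
    # Three chips for every factor of rho beyond the threshold, plus one.
    return 3 * math.ceil(math.log(alpha * x / (1 - alpha), rho)) + 1


def tau_join(x: float, y: float, alpha: float, rho: float) -> int:
    """Chips for joining trees of weights x and y when the heavier one is unknown."""
    if x == 0 or y == 0:
        return 0  # joining with an empty tree costs nothing
    heavy, light = max(x, y), min(x, y)
    return tau(heavy / light, alpha, rho)


if __name__ == "__main__":
    for ratio in (1.0, 3.9, 4.2, 5.0, 50.0):
        print(ratio, tau(ratio, 0.2, 1.1))
```

With these sample parameters the threshold ratio is 4, so ratios below it cost a single chip and the cost then climbs in steps of three.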

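Since we must amortise the running time, we will need another function

\[ \chi(x) = \begin{cases} 1, & \text{if } 1 \ge x \ge \dfrac{\alpha}{(1-\alpha)\rho}; \\[1ex] 3\Bigl\lceil \log_\rho \dfrac{\alpha}{(1-\alpha)\rho x} \Bigr\rceil + 1, & \text{if } \dfrac{\alpha}{(1-\alpha)\rho} > x > 0; \end{cases} \tag{3.20} \]

defined for $0 < x \le 1$. (Actually, we only use $\chi$ for $x \le \alpha'$.) This function plays a role
analogous to casts in a biased 2-3 tree. We say a tree is completely cast if it satisfies the
following invariant.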


Chip Invariant. Any subitem node n with balance $\beta_n$ has at least $\chi(\beta_n)$ chips piled on
it.

Table 3.1. Some values of $\tau$ and $\chi$.

Table 3.1 shows the first few values of $\tau$ and $\chi$. The next lemma establishes some of
their elementary properties.

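Lemma 3.11. As $x$ increases, $\tau(x)$ increases and $\chi(x)$ decreases. Furthermore,

(a)  $\tau(x) \ge \tau(x/q) + 3$,  if $\rho \le q$ and $x > (1-\alpha)/\alpha$.

(b)  $\chi(x/q) \ge \chi(x) + 3$,  if $\rho \le q$ and $x < \alpha/(1-\alpha)$.

(c)  $\tau(x) = \chi(1/x) + 3$,  if $x > (1-\alpha)/\alpha$.

(d)  $\tau(x) \le \chi(1/x) + 3$,  if $x \ge 1$.

(e)  $\chi(1/(qx)) \ge \tau(x)$,  if $\rho \le q$ and $x \ge 1$.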

Proof. Monotonicity and properties (a)-(c) follow immediately from Equation (3.19) and
Equation (3.20). Property (d) follows from (c), since $\tau(x) = \chi(1/x)$ for $1 \le x \le (1-\alpha)/\alpha$.
For property (e) there are two cases. If $x \le (1-\alpha)/\alpha$, then $\tau(x) = 1 \le \chi(1/(qx))$. If
$x > (1-\alpha)/\alpha$, then $\chi(1/(qx)) = \tau(qx) - 3 \ge \tau(x)$, by properties (c) and (a).
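Properties of this kind are easy to spot-check numerically. The sketch below assumes the piecewise forms of $\tau$ and $\chi$ used in this section (one chip up to the threshold, then three chips per factor of $\rho$) and tests (a)-(e) on random samples; the parameter values, the function names, and the sampling ranges are illustrative assumptions only, and a passing run is evidence rather than a proof.

```python
import math
import random

ALPHA, RHO = 0.2, 1.1  # illustrative parameter values, not prescribed by the text


def tau(x: float) -> int:
    """Assumed join-cost function, for x >= 1."""
    if x <= (1 - ALPHA) / ALPHA:
        return 1
    return 3 * math.ceil(math.log(ALPHA * x / (1 - ALPHA), RHO)) + 1


def chi(x: float) -> int:
    """Assumed chip requirement for a node of balance x, for 0 < x <= 1."""
    if x >= ALPHA / ((1 - ALPHA) * RHO):
        return 1
    return 3 * math.ceil(math.log(ALPHA / ((1 - ALPHA) * RHO * x), RHO)) + 1


def check_lemma_3_11(trials: int = 2000, seed: int = 1) -> bool:
    rng = random.Random(seed)
    threshold = (1 - ALPHA) / ALPHA
    for _ in range(trials):
        x = math.exp(rng.uniform(0.0, 6.0))          # x >= 1
        q = RHO * math.exp(rng.uniform(0.001, 1.0))  # q > rho
        if x > threshold and x / q >= 1 and tau(x) < tau(x / q) + 3:
            return False                              # (a)
        y = 1 / x
        if y < ALPHA / (1 - ALPHA) and chi(y / q) < chi(y) + 3:
            return False                              # (b)
        if x > threshold and tau(x) != chi(1 / x) + 3:
            return False                              # (c)
        if tau(x) > chi(1 / x) + 3:
            return False                              # (d)
        if chi(1 / (q * x)) < tau(x):
            return False                              # (e)
    return True


if __name__ == "__main__":
    print(check_lemma_3_11())
```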

Our  goal  in  this  section  is  to  prove  the  following  theorem. 
Theorem 3.12. The join algorithm uses $\tau(W(S), W(T))$ chips to join S and T, leaving the
result completely cast, assuming both S and T were completely cast, provided that certain
relations among $\alpha$, $\alpha'$, and $\rho$ hold. (These relations will be discussed in Section 3.9.)

Proof. Under the assumptions of the algorithm, the weight ratio $x$ is $1/z$, and we must
show that $\tau(x)$ chips suffice. Before we begin the case analysis, a word about overhead and
certain subitem nodes. The algorithm factors rather fortuitously into the “odd” cases (1,
3, 5, and 7), and the “even” cases (2, 4, and 6). In the odd cases, we do a bounded amount
of work rotating S before finally doing useful work in Case 1 or 3. One chip will pay for
all this work (except the recursive calls); in other words, the overhead for Cases 5 and 7 is
paid by Case 3.

The even cases are a little more complicated. They do a bounded amount of rebalancing
(modulo recursive calls) before ending up in Case 2 or 3. However, the chips needed to
satisfy the chip invariant on node a' in Case 4b and on node g' in Case 6c must come out
of the supply from the cashier. Assuming that one chip apiece is needed for these nodes,
by the time we get to Case 2 or 3 our supply of chips has possibly shrunk from $\tau(x)$ to
$\tau(x) - 2$. So the overhead for the even cases is paid by Case 2 or 3, including two extra
chips for casting a' and g'. Of course, we must prove that one chip apiece is enough to cast
them.

Now we will go through the join algorithm case by case to prove each case has enough
chips, assuming certain relations among $\alpha$, $\alpha'$, and $\rho$. For brevity we say that a table (or a
line of a table) is good if the sum of the “Given” column exceeds the sum of the “Needed”
column.

Case 1.

Given                          Needed
$\tau(1/z)$ from cashier       1 overhead

Since $x = 1/z \ge 1$, the table is good because $\tau(x) \ge 1$ for $x \ge 1$.

Case 2.

Given                          Needed
$\tau(1/z)$ from cashier       1 + 2 overhead
$\chi(\beta_f)$ on f           $\tau(f, z)$ recursive call
                               $\chi(\beta_{z'})$ to cast z' (if necessary)

If $f \ge z$, then let $x = 1/z$; since $z' \ge f$, then $\beta_{z'} \ge \beta_f$ and so $\chi(\beta_f) \ge \chi(\beta_{z'})$. Since
$f \le \alpha'/(1+\alpha')$ by Lemma 3.5(a), and since $z < \alpha/(1-\alpha)$, the table is good if

\[ \tau(x) \ge 3 + \tau\Bigl(\frac{\alpha'}{1+\alpha'}\,x\Bigr), \quad \text{for } x > \frac{1-\alpha}{\alpha}, \]

which is true by Lemma 3.11(a), assuming

\[ \rho \le \frac{1+\alpha'}{\alpha'}. \tag{3.21} \]


Otherwise if $z > f$, let $x = 1/z$ and let $y = z/f$. Since $z < \alpha/(1-\alpha)$ and $a \le \alpha' m(b)$,
we have $m(b) = 1 - a - f \ge 1 - \alpha' m(b) - \alpha/(1-\alpha)$, so

\[ \frac{z}{m(b)} \le \frac{\alpha}{1-\alpha}\cdot\frac{(1-\alpha)(1+\alpha')}{1-2\alpha}. \]

Since $\beta_f = z/(m(b)y)$ and $\beta_{z'} = (f+z)/(1-a-f) \ge z = 1/x$, the second line is good if

\[ \chi\Bigl(\frac{\alpha(1+\alpha')}{1-2\alpha}\cdot\frac{1}{y}\Bigr) \ge \tau(y), \quad \text{for } y \ge 1; \]

the table is good if, in addition,

\[ \tau(x) \ge 3 + \chi\Bigl(\frac{1}{x}\Bigr), \quad \text{for } x > \frac{1-\alpha}{\alpha}. \]

The second inequality is true by Lemma 3.11(c); the first only applies if $f > 0$ (if $f = 0$
the second line of the table is vacuously good), and it is true by Lemma 3.11(e), assuming

\[ \rho \le \frac{1-2\alpha}{\alpha}\cdot\frac{1}{1+\alpha'}. \tag{3.22} \]
Case 3.

Given                          Needed
$\tau(1/z)$ from cashier       1 + 2 overhead
                               $\tau(f, z)$ recursive call

If $f \ge z$, then let $x = 1/z$; since $f = 1 - a \le (1-2\alpha)/(1-\alpha)$ and $z < \alpha/(1-\alpha)$, the
table is good if

\[ \tau(x) \ge 3 + \tau\Bigl(\frac{1-2\alpha}{1-\alpha}\,x\Bigr), \quad \text{for } x > \frac{1-\alpha}{\alpha}, \]

which is true by Lemma 3.11(a), assuming

\[ \rho \le \frac{1-\alpha}{1-2\alpha}. \tag{3.23} \]


Otherwise if $z > f$, then since $z < \alpha/(1-\alpha)$, so $\tau(1/z) \ge 4$, and since $f \ge \alpha$ by balance,
the table is good if

which  is  true  by  Equation  (3.10),  assuming 


1  ^  l-o 

«  S  "  • 

l  —  o  a 


(3.24) 


Case 4.

Given                          Needed
$\chi(\beta_d)$ on d           $\tau(a, d)$ recursive call
1 from overhead                $\chi(\beta_{a'})$ to cast a' (if necessary)


First  note  that  if  Case  4b  applies,  then  1  >  x(fi*')t  dnce  fia*  —  (a  +  d)/(/  —  d  —  g)  > 
off  m*  pa  >  a/(l  —  a),  so  the  second  line  is  good.  If  d  =  0,  the  first  line  is  vacuously 
good.  Now  if  a  >  d  >  0,  then  let  z  =*  a/d;  since  fid  =*  a/(m(/)z),  and  since  a/m(/)  < 
a(l  +  2a')//  ■«(!  +  2o')/(l  —  a)  <  o(l  +  2o')/(l  —  2a)  by  Lemma  3.3(c),  the  first  line  is 
good  if 

\[ \chi\Bigl(\frac{\alpha(1+2\alpha')}{1-2\alpha}\cdot\frac{1}{x}\Bigr) \ge \tau(x), \quad \text{for } x \ge 1, \]

which is true by Lemma 3.11(e), assuming

\[ \rho \le \frac{1-2\alpha}{\alpha}\cdot\frac{1}{1+2\alpha'}. \tag{3.25} \]

Otherwise  if  d  >  a,  then  since  a  >  a,  fid  <  a',  and  d  <  a'(l  —  a)/(l  +  a'),  by  balance 
and  Lemma  3.5(a),  the  first  line  is  good  because 


«;>  *  «TT*'-ir » £  ''Hr*  *  ’•Hr'  - 1  s 


Case 5. This case is paid for by Case 3.

Case 6.

Given                          Needed
$\chi(\beta_c)$ on c           $\tau(a, c)$ recursive call
$\chi(\beta_e)$ on e           $\tau(e, g)$ recursive call
1 from overhead                $\chi(\beta_{g'})$ to cast g' (if necessary)


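First note that if Case 6c applies, then $1 \ge \chi(\beta_{g'})$, since $\beta_{g'} \ge g/m(d) \ge g/d = \beta_g \ge
\alpha/(1-\alpha)$ by balance, so the third line is good. If $c = 0$, the first line is vacuously good.
Now if $a \ge c > 0$, then let $x = a/c$; since $\beta_c = a/(m(d)x)$, and since

\[ \frac{a}{m(d)} \le \frac{\alpha}{1-\alpha}\cdot\frac{(1-\alpha)(1+2\alpha')}{1-3\alpha+\alpha^2} \]

by Equation (3.10), the first line is good if

\[ \chi\Bigl(\frac{\alpha(1+2\alpha')}{1-3\alpha+\alpha^2}\cdot\frac{1}{x}\Bigr) \ge \tau(x), \quad \text{for } x \ge 1, \]

which is true by Lemma 3.11(e), assuming

\[ \rho \le \frac{1-3\alpha+\alpha^2}{\alpha}\cdot\frac{1}{1+2\alpha'}. \tag{3.26} \]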
If  e  >  a,  then  since  a  >  a  and  e  < 
first  line  is  good  because 


t(-)  <  t{-^— 

V  “  vl  +  a/ 


a'(l  —  a)3/(l  +  a')  by  balance  and  Lemma  3.5(a),  the 

<  r(~“)  =  1  <  x</fe). 

a  a 


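If $e = 0$, the second line is vacuously good. Next, if $g \ge e > 0$, then let $x = g/e$; since
$\beta_e = g/(m(d)x)$, and since

\[ \frac{g}{m(d)} \le \frac{\alpha(1-\alpha)(1+2\alpha')}{1-3\alpha+\alpha^2} \]

by Equation (3.10), the second line is good if

\[ \chi\Bigl(\frac{\alpha(1-\alpha)(1+2\alpha')}{1-3\alpha+\alpha^2}\cdot\frac{1}{x}\Bigr) \ge \tau(x), \quad \text{for } x \ge 1, \]

which is true by Lemma 3.11(e), assuming

\[ \rho \le \frac{1-3\alpha+\alpha^2}{\alpha(1-\alpha)}\cdot\frac{1}{1+2\alpha'}. \tag{3.27} \]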


Finally,  if  e  >  g,  then  since  g  >  af  and  e  <  ct(l  —  a)//(  1  +  o')  by  balance  and 
Lemma  3.5(a),  the  second  line  is  good  because 


Case  7.  This  case  is  paid  for  by  Case  3. 
This  completes  the  proof. 


3.7.  The  SPLIT  operation. 

The split operation takes a tree S and a key K, and returns an item node I storing
the item of K, as well as two trees L and R storing the left- and right-items of K. The
trees may be null if the corresponding set of items is empty. For simplicity we assume that
S contains the item of K at rank k; a simple change to Case 3 of the algorithm allows the
split operation to work when K is not in S, but its running time becomes more difficult
to analyse due to the uncertainty about the length of the path to the “missing” key. See
Section 1.2, where this problem also arises with the insert operation.

The  split  operation  works  as  follows: 
Figure 3.13. A biased weight-balanced tree about to be split.

Figure 3.14. Case 1, b is an item node containing the search key.


Algorithm. Split S at K.

Input: A tree S and a key K.

Preconditions: S contains the item of K. $W(S) = W$, $W(K) = w$.

Output: A tree L (or null), an item node I, and a tree R (or null).

Postconditions: L stores the left-items of K, I stores the item of K, and R stores the
right-items of K. There are no subitem nodes under I.

There are three cases; the algorithm uses the one that applies. Figure 3.13 shows a biased
weight-balanced tree about to be split.

Case 1. [b is an item node whose key is K.] Detach the subtrees a and c from b, set L ← a,
I ← b, and R ← c, and return. See Figure 3.14.

Case 2. [b is a non-item node.] The search key K occurs in one of the two subtrees a
and c; assume that it occurs in c (the other case is symmetric). Recursively split
the subtree c at K to form L', I, and R; join a and L' to form L; and return.
See Figure 3.15.

Case 3. [b is an item node whose key is not K.] The search key K occurs in one of the
two subtrees a and c; assume that it occurs in c (the other case is symmetric).
Detach c from b and recursively split c to form L', I, and R; join b and L' to
form L; and return. See Figure 3.16.

Figure 3.15. Case 2, b is a non-item node.

Figure 3.16. Case 3, b is an item node.

Proposition 3.13. The split algorithm is correct.

Proof.  Immediate  from  the  correctness  of  the  join  algorithm. 
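The recursive structure of the three cases can be illustrated with a small Python sketch. It is a simplification under stated assumptions: weights, balance, and chips are ignored; `naive_join` is a hypothetical stand-in that merely hangs both trees under a new non-item node instead of running the rebalancing join algorithm of this chapter; and the `lo` field (smallest key in a subtree) is a routing aid invented for the example.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class Node:
    key: Optional[int]              # None marks a non-item node
    left: Optional["Node"] = None
    right: Optional["Node"] = None
    lo: int = 0                     # smallest key stored in this subtree


def make(key: Optional[int], left: Optional[Node] = None,
         right: Optional[Node] = None) -> Node:
    """Build a node and record the smallest key below it."""
    if left is not None:
        lo = left.lo
    elif key is not None:
        lo = key
    else:
        lo = right.lo
    return Node(key, left, right, lo)


def naive_join(s: Optional[Node], t: Optional[Node]) -> Optional[Node]:
    """Placeholder join: every key of s precedes every key of t; no rebalancing."""
    if s is None:
        return t
    if t is None:
        return s
    return make(None, s, t)


def split(b: Node, k: int) -> Tuple[Optional[Node], Node, Optional[Node]]:
    """Split the tree rooted at b at key k, which must be present."""
    if b.key == k:                                   # Case 1
        return b.left, make(k), b.right
    if b.right is not None and k >= b.right.lo:      # k lies in the right subtree c
        l_prime, item, r = split(b.right, k)
        if b.key is None:                            # Case 2: join a and L'
            return naive_join(b.left, l_prime), item, r
        # Case 3: detach c, then join what is left of b with L'
        return naive_join(make(b.key, b.left, None), l_prime), item, r
    # Symmetric cases: k lies in the left subtree a
    l, item, r_prime = split(b.left, k)
    if b.key is None:
        return l, item, naive_join(r_prime, b.right)
    return l, item, naive_join(r_prime, make(b.key, None, b.right))


def inorder(t: Optional[Node]) -> List[int]:
    if t is None:
        return []
    own = [] if t.key is None else [t.key]
    return inorder(t.left) + own + inorder(t.right)


def example_tree() -> Node:
    left = make(None, make(1), make(2, None, make(3)))
    right = make(6, make(5), make(7))
    return make(4, left, right)
```

For instance, splitting `example_tree()` at key 5 yields a left tree holding 1 to 4, the item node for 5, and a right tree holding 6 and 7.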


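3.8. Analysis of the SPLIT algorithm.

As usual, we account for time spent in the split algorithm with poker chips. Let

\[ \lambda = \max\Bigl(3\Bigl\lceil\log_\rho\frac{1-\alpha}{\alpha}\Bigr\rceil + 2,\; 3\Bigl\lceil\log_\rho\frac{1}{\alpha}\Bigr\rceil\Bigr) \tag{3.28} \]

be a constant (depending on $\alpha$ and $\rho$). Then let

\[ \sigma(x) = \begin{cases} 7, & \text{if } x = 1; \\ (2+3\lambda)\lceil\log_{\rho'} x\rceil + (5-3\lambda), & \text{if } x > 1. \end{cases} \tag{3.29} \]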
Range of $x$                      $\sigma(x)$
$\rho' < x \le \rho'^2$           $9 + 3\lambda$
$\rho'^2 < x \le \rho'^3$         $11 + 6\lambda$
$\rho'^3 < x \le \rho'^4$         $13 + 9\lambda$

Table 3.2. Some values of $\sigma$.

This is the running-time function; that is, $\sigma(x)$ is the number of chips the cashier allots to
split a tree of weight $W$ at a node of weight $w$, where the weight ratio $W/w$ is $x$. The
constant $\rho'$ is some number greater than 1 whose value will be determined in Section 3.9.
Table 3.2 gives the first few values of $\sigma$.

Lemma 3.14. The function $\sigma$ is monotonically increasing. Furthermore,
$\sigma(qx) \ge \sigma(x) + (2+3\lambda)$, if $\rho' \le q$ and $x > 1$.


Proof.  Immediate  from  Equation  (3.29). 
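A few lines of Python make the step structure of the split allotment visible. This is a sketch under the assumption that $\lambda$ and $\sigma$ have the forms given in Equations (3.28) and (3.29) as read here; the function names `lam` and `sigma` and the sample values $\alpha = 0.2$, $\rho = 1.1$, $\rho' = 1.25$ are illustrative only.

```python
import math


def lam(alpha: float, rho: float) -> int:
    """The constant lambda of Equation (3.28), as read in this section."""
    return max(3 * math.ceil(math.log((1 - alpha) / alpha, rho)) + 2,
               3 * math.ceil(math.log(1 / alpha, rho)))


def sigma(x: float, lam_value: int, rho_prime: float) -> int:
    """Chips allotted to a split with weight ratio x = W/w >= 1."""
    if x < 1:
        raise ValueError("sigma is defined only for x >= 1")
    if x == 1:
        return 7
    steps = math.ceil(math.log(x, rho_prime))
    return (2 + 3 * lam_value) * steps + (5 - 3 * lam_value)


if __name__ == "__main__":
    lv = lam(0.2, 1.1)
    for ratio in (1.0, 1.2, 1.5, 1.875):
        print(ratio, sigma(ratio, lv, 1.25))
```

Each further factor of $\rho'$ in the ratio adds $2 + 3\lambda$ chips, which is the increment that Lemma 3.14 relies on.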

After  we  split  a  biased  2-3  tree,  we  left  the  resulting  trees  cast  to  the  rank  of  the 
original  tree;  the  chips  in  the  cast  helped  pay  to  join  the  trees  to  other  trees  in  the 
forest.  The  analogous  idea  for  biased  weight-balanced  trees  is  to  leave  the  resulting  trees 
completely  cast  (so  that  their  subitem  nodes  have  chips),  and  to  add  enough  chips  to  the 
roots  to  satisfy  the  following  invariant. 

SPLIT Invariant. After a tree of weight $W$ is split at an item with weight $w$, yielding
two trees L and R with weights $l$ and $r$, there will be $\tau(W/l)$ chips on the root of L and
$\tau(W/r)$ chips on the root of R.

Here $\tau$ is the function describing the running time of the join algorithm, defined in
Equation (3.19). We will need two more simple facts about $\tau$.
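Lemma 3.15. If $x$ and $y$ are at least 1, then

\[ \tau(xy) \le \tau(x) + \tau(y) + \lambda - 3, \tag{3.30} \]

and

\[ \tau(x/\alpha) \le \tau(x) + \lambda. \tag{3.31} \]

Proof. The proof uses an even simpler fact, namely $\lceil a + b\rceil \le \lceil a\rceil + \lceil b\rceil$. If $x, y \le (1-\alpha)/\alpha$,
then $\tau(x) = \tau(y) = 1$, so

\[ \tau(xy) \le \tau\Bigl(\Bigl(\frac{1-\alpha}{\alpha}\Bigr)^2\Bigr) = 3\Bigl\lceil\log_\rho\frac{1-\alpha}{\alpha}\Bigr\rceil + 1 \le \lambda - 1 = \tau(x) + \tau(y) + \lambda - 3; \]

if $x \le (1-\alpha)/\alpha < y$, then $\tau(x) = 1$, so

\[ \tau(xy) = 3\Bigl\lceil\log_\rho\frac{\alpha y}{1-\alpha} + \log_\rho x\Bigr\rceil + 1 \le \tau(y) + (\lambda - 2) + (\tau(x) - 1); \]

and if $x, y > (1-\alpha)/\alpha$, then

\[ \tau(xy) = 3\Bigl\lceil\log_\rho\frac{\alpha x}{1-\alpha} + \log_\rho\frac{\alpha y}{1-\alpha} + \log_\rho\frac{1-\alpha}{\alpha}\Bigr\rceil + 1 \le \tau(x) + (\tau(y) - 1) + (\lambda - 2). \]

This proves Equation (3.30). Finally,

\[ \tau(x/\alpha) = 3\Bigl\lceil\log_\rho\frac{\alpha x}{1-\alpha} + \log_\rho\frac{1}{\alpha}\Bigr\rceil + 1 \le \tau(x) + \lambda. \]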


Theorem 3.16. The split algorithm uses $\sigma(W/w)$ chips to split S at K, and to leave
its result satisfying both the Chip and Split Invariants, assuming that its input satisfied
the Chip Invariant.

Proof. By normalising, we may assume that $W = b = 1$ (see Figure 3.13). As usual, we
will go through the algorithm case by case, deriving chip tables and proving that they are
good.


Case 1.

Given                          Needed
$\sigma(1/w)$ from cashier     1 overhead
$\chi(\beta_a)$ on a           $\tau(1/a)$ to cast a
$\chi(\beta_c)$ on c           $\tau(1/c)$ to cast c


(See  Figure  3.14.)  Let  z  =  l/w  =  l/m(b).  Now  t(1/o)  <  x(<*)+3  <  x(a*)+3  =  x(0a)+3 
by  Lemma  3.11(d);  similarly  r(l/c)  <  x(Pc)  +  3,  so  the  table  is  good  because  a(x)  >  7  for 
z  >  1. 


Case 2.

Given                          Needed
$\sigma(1/w)$ from cashier     1 overhead
$\tau(c/l')$ on L'             $\sigma(c/w)$ recursive call
$\tau(c/r)$ on R               $\tau(a, l')$ JOIN
                               $\tau(1/(a+l'))$ to cast L
                               $\tau(1/r)$ to cast R


(See  Figure  3.15.)  Let  z  =  c/w.  Since  e  >  a,  by  Equation  (3.31)  we  have  r(l/r)  < 
r(c/r)  +  X.  Also  t(1/(o  +  l’))  <  r(l fa)  <  r(l/a)  <  1  +  X,  by  Equation  (3.31).  Now  if 
a  >  l1,  then  by  balance  and  Equation  (3.31)  we  have 


r(c/n  <  T(~“"  ^7)  <  r(~)  <  r(a//')  +  X; 


and  if  V  >  a  then  r(c//#)  =  r(o/Z')  =  1  because  a  <  V  <  c  implies  that  e/l'  and  V /a  are 
both  less  than  c/a  =  pc  <  (1  —  a)/ a.  By  these  facts  and  the  fact  that  c  <  1  —  a,  the  table 
is  good  if 

\[ \sigma\Bigl(\frac{1}{1-\alpha}\,x\Bigr) \ge \sigma(x) + 2 + 3\lambda, \]

which is true by Lemma 3.14, assuming

\[ \rho' \le \frac{1}{1-\alpha}. \tag{3.32} \]

Case 3.

Given                          Needed
$\sigma(1/w)$ from cashier     1 overhead
$\chi(\beta_c)$ on c           $\sigma(c/w)$ recursive call
$\tau(c/l')$ on L'             $\tau(1-c, l')$ JOIN
$\tau(c/r)$ on R               $\tau(1/(1-c+l'))$ to cast L
                               $\tau(1/r)$ to cast R

(See Figure 3.16.) Let $x = c/w$. As in Case 2, $\tau(1/r) \le \tau(c/r) + \lambda$. Since $c \le \alpha'/(1+\alpha')$
by Lemma 3.5(a), $\tau(1/(1-c+l')) \le \tau(1/(1-c)) \le \tau(1+\alpha') = 1$, assuming

\[ 1+\alpha' \le \frac{1-\alpha}{\alpha}. \tag{3.33} \]

Now assuming

\[ \frac{\alpha'}{1+\alpha'} \le \frac{1}{2}, \tag{3.34} \]

then $c \le 1/2$, so $1-c \ge c \ge l'$ and the join has L' as the smaller tree. But now


.1-c.  .1 

<—n~)  =  T( 


c  m(b) 


< 

<  r( 


m(6)  /' 

:i+« 

1  m(6) 


) 


<  r((l  +  *')^) 


) 


a  V 
m(5)c. 
c  + 


<  X(0c) +>(£)  +  2X 


by Lemma 3.5(d), Equation (3.33), Equation (3.31), Equation (3.30), and Lemma 3.11(c).
Thus the table is good if

\[ \sigma\Bigl(\frac{1+\alpha'}{\alpha'}\,x\Bigr) \ge \sigma(x) + 3\lambda + 2, \]

which is true by Lemma 3.14, assuming

\[ \rho' \le \frac{1+\alpha'}{\alpha'}. \tag{3.35} \]


This  completes  the  proof. 
3.9. Tuning the algorithms.


Recall  the  running  times  for  the  various  operations: 

access  (worst  case):  lg  —  +  0(1), 

lg/c  to 


ACCESS  (average  case): 


Ha  lg  to  + 

r^-lg-  +0(1), 
lgp  ti> 

2  +  3X  ,  W  „,.v 

TIF“v  +  fl,1)' 


where 


k  =  min(  ~~7~)  and  X  =  max(3pogp  ^-^1  +  2, 3flogp  i)) 


1  -  a. 
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1  -2a 
a(l  -  a) 
o' 


(14) 

2  —  4a  +  a*  >  1 

(15) 

P  < 

1  +  a' 
o' 

1  -2a  1 

> 

1  —  a 

(16) 

P  < 

a  1  +  af 

1  —  a 

>  1  +  a' 

(17) 

P  < 

1  —  2a 

(18) 

VI 

s 

-•  1 
wH 

1  —  a 

a 

“  l-2a 

(19) 

P  < 

1  -  2a  1 

a  1  +  2  at 

1  -  3a  +  a2  1 

2a  —  a2  > 


1  -  3a  +  a3 


~  1  —  a 
„  a 


a(l  —  a)  +  0/(1  —  a  —  a2)  —  1  —  a 


a  l  +  2a> 
1  -  3a  +  a2  1 
a(l  -  a)  1  +  20* 


a(l-a)  (22) 

“  1  -  2a 

1  —  a 

^  a(l  —  a)  (23) 

1  +  a'  <  — - 
a 

“  1  -  3a  +  a* 

A  \ 

a'  .  1 

1  —  3a  +  a9  >  — ^ — 
“  1  —  a 

1  —  3a  +  a3  >  a 
(1  -  2a)(2  -  a)  >  1  -  a 


1  +  a*  ~  2 

,  ^  1  +  a' 


Table  3.3. 

Relations  among  a,  a',  p,  and  ft . 


We should choose the parameters $\alpha$, $\alpha'$, $\rho$, and $\rho'$ to minimise the leading coefficients,
subject to the constraints given by Equations (3.3), (3.4), (3.5), (3.6), (3.7), (3.8), (3.9),
(3.11), (3.12), (3.13), (3.14), (3.15), (3.16), (3.17), (3.21), (3.22), (3.23), (3.24), (3.25), (3.26),
(3.27), (3.32), (3.33), (3.34), and (3.35). These equations are collected in Table 3.3.

In Table 3.4 we have simplified the inequalities of Table 3.3 and set $\alpha' = \alpha/(1-3\alpha)$
in the inequalities involving $\rho$ and $\rho'$. This will be justified very soon.

In light of (1), it is tediously checked that inequalities (6), (7), (12), (13), (14), and (18)
are weaker than (11). This means we are required to choose $\alpha \le \alpha_0$, where $\alpha_0 = 0.24512^+$
is the root of $1 - 5\alpha + 4\alpha^2 - \alpha^3$. Assuming this, we check that (2), (4), (8), (9), and (23)
(1) 

(2) 

(3) 

(4) 

(5) 

(«) 

(7) 

(8) 

(») 

(10) 

(11) 

(12) 

(13) 


1  —  4a  +  2a8 
2  -  3a  +  a8 


1  —  5a  +  4a8  —  a3 
1  —  4a  +  a* 
1  —  4a  +  2a8 


< 

< 

> 

< 

> 

> 

> 

< 

< 

> 

> 

> 

> 


1 

2 

1  -2a 

a 

a 

1  -2a 
1  -  3a  +  a8 

°(1  ~  «) 
a 

1  —  3a 
0 
0 

1  -  4a  +  3a8 
a(l  —  a  —  aa) 
1  —  3a  +  a8 
ofl 

«(1  -  «) 

1  -  3a  +  a* 

0 

0 

0 


(14) 

(15) 

(16) 

(17) 

(18) 

(19) 

(20) 
(21) 
(22) 

(23) 

(24) 

(25) 


1  —  4a  +  a8  > 
P  < 
P  < 
P  < 

1  -  3a  +  a*  > 
P  < 

P  < 

P  < 

p'< 

a'  < 
a'  < 


0 

1  —  2a 
a 

1  —  3a 
a 

1  —  a 
1  —  2a 
0 

1  -  5a  +  6a8 
a(l  -  a) 

1  -  6a  +  10a8  -  3a8 
a(l  -  a) 

1  -  6a  +  10a8  -  3a3 
a(l  -  a)2 

1 

1  —  a 
1  —  2a 

a 

1 

1  —  2a 
a 


Table  3.4. 

Simplified  relations  among  a,  a',  p,  and  pf. 


are weaker than (24), and that (3) and (10) are weaker than (5).

It is always advantageous to choose $\alpha'$ as small as possible; this discourages subitem
nodes and allows more liberal choices of $\rho$ and $\rho'$. Thus we should set $\alpha' = \alpha/(1-3\alpha)$.
Now we check that (15), (16), (19), and (21) are weaker than (20), and that (25) is weaker
than (22). Hence (22) governs the choice of $\rho'$ and either (17) or (20) governs the choice of
$\rho$. Clearly we should set $\rho' = 1/(1-\alpha)$. For $\rho$, we check that (20) is weaker than (17) for
$0 < \alpha \le \alpha_1$, where $\alpha_1 = 0.18983^+$ is the smaller root of $1 - 9\alpha + 24\alpha^2 - 24\alpha^3 + 6\alpha^4$, but
that (17) is weaker than (20) for $\alpha_1 \le \alpha \le \alpha_0$. So to choose $\rho$, we compare $\alpha$ to $\alpha_1$ and
use (17) or (20) accordingly.

In order for the running time analyses to be valid, we must also have $\rho$ and $\rho'$ greater
than 1. The right-hand sides of (17) and (22) always exceed 1, but the right-hand side of (20)
does so only for $\alpha < \alpha_2$, where $\alpha_2 = 0.20550^-$ is the smallest root of $1 - 7\alpha + 11\alpha^2 - 3\alpha^3$.
Note that $\alpha_2 < \alpha_0$, so this further restricts the feasible range for $\alpha$. (It is curious that for
$\alpha_2 \le \alpha \le \alpha_0$ the algorithm works correctly, but not provably efficiently, at least not by
this technique.)
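The three critical values of $\alpha$ can be recomputed from their defining polynomials. The sketch below uses plain bisection; the helper name `bisect_root` and the bracketing intervals are choices made for this illustration (each polynomial changes sign exactly once on its interval), not something taken from the thesis.

```python
from typing import Callable


def bisect_root(poly: Callable[[float], float], lo: float, hi: float,
                tol: float = 1e-12) -> float:
    """Locate a sign change of poly on [lo, hi] by bisection."""
    f_lo = poly(lo)
    if (f_lo > 0) == (poly(hi) > 0):
        raise ValueError("the interval does not bracket a root")
    while hi - lo > tol:
        mid = (lo + hi) / 2
        f_mid = poly(mid)
        if (f_mid > 0) == (f_lo > 0):
            lo, f_lo = mid, f_mid   # root lies in the upper half
        else:
            hi = mid                # root lies in the lower half
    return (lo + hi) / 2


# alpha_0: correctness limit; alpha_1: crossover of (17) and (20);
# alpha_2: largest alpha for which the bound from (20) still exceeds 1.
alpha_0 = bisect_root(lambda a: 1 - 5*a + 4*a**2 - a**3, 0.2, 0.3)
alpha_1 = bisect_root(lambda a: 1 - 9*a + 24*a**2 - 24*a**3 + 6*a**4, 0.1, 0.2)
alpha_2 = bisect_root(lambda a: 1 - 7*a + 11*a**2 - 3*a**3, 0.1, 0.25)

if __name__ == "__main__":
    print(round(alpha_0, 5), round(alpha_1, 5), round(alpha_2, 5))
```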


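To summarise, all the results in this chapter are valid if we choose

\[ \alpha < \alpha_2, \qquad \alpha' = \frac{\alpha}{1-3\alpha}, \qquad \rho' = \frac{1}{1-\alpha}, \]

and

\[ \rho = \begin{cases} \dfrac{1-\alpha}{1-2\alpha}, & \text{if } 0 < \alpha \le \alpha_1, \\[1ex] \dfrac{1-6\alpha+10\alpha^2-3\alpha^3}{\alpha(1-\alpha)}, & \text{if } \alpha_1 \le \alpha < \alpha_2, \end{cases} \]

where

$\alpha_1 = 0.18983^+$ satisfies $1 - 9\alpha + 24\alpha^2 - 24\alpha^3 + 6\alpha^4 = 0$,
and $\alpha_2 = 0.20550^-$ satisfies $1 - 7\alpha + 11\alpha^2 - 3\alpha^3 = 0$.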


We get the absolute maximum value for $\rho$ by setting $\alpha = \alpha_1$; by increasing $\alpha$, we improve
the access time (since $\kappa$ and $H_\alpha$ both increase) at the expense of the join time (since $\rho$
decreases). These formulas can be used to determine a compromise setting for $\alpha$, assuming
a decision about the relative frequency of access and join operations. Some sample values
appear in Table 3.5.

$\alpha$    $\alpha'$   $\rho$     $\rho'$    $1/\lg\kappa$  $1/H_\alpha$  $3/\lg\rho$  $(2+3\lambda)/\lg\rho'$
0.05        0.05882+    1.0556-    1.0526+    13.513+        3.4917-       38.460+      6837.8-
0.10        0.14286-    1.125      1.1111+    6.5788+        2.1322+       17.655-      1197.3+
0.15        0.27273-    1.2143-    1.1765-    4.2650+        1.6398-       10.710+      392.38+
0.18        0.39130+    1.2812+    1.2195+    3.4928-        1.4704+       8.3904-      248.99-
0.18983+    0.44094-    1.3060+    1.2343+    3.2927+        1.4263+       7.7889+      214.03-
0.19        0.44188+    1.3023-    1.2346-    3.2894-        1.4256-       7.8729-      213.81+
0.20        0.5         1.1        1.25       3.1063-        1.3852-       21.818-      481.47+
0.205       0.53247-    1.0088-    1.2579-    3.0214-        1.3665-       238.09+      4955.1-

Table 3.5. Sample values for the parameters.
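The parameter choices summarised above can be turned into a small calculator. The sketch below is an illustration under assumptions: `Tuning` and `tune` are hypothetical names; $\rho$ is taken as the smaller of the two bounds (17) and (20), which agrees with the case split at $\alpha_1$; $\lambda$ follows Equation (3.28); and $H_\alpha$ is taken to be the binary entropy of $\alpha$, an interpretation that matches the tabulated $1/H_\alpha$ column numerically.

```python
import math
from dataclasses import dataclass


@dataclass
class Tuning:
    alpha: float
    alpha_prime: float
    rho: float
    rho_prime: float
    lam: int
    avg_access_coeff: float   # 1 / H(alpha), H assumed to be binary entropy
    join_coeff: float         # 3 / lg(rho)
    split_coeff: float        # (2 + 3*lam) / lg(rho_prime)


def tune(alpha: float) -> Tuning:
    """Derive the remaining parameters and leading coefficients from alpha."""
    if not 0 < alpha < 0.25:
        raise ValueError("alpha must lie in (0, 0.25)")
    alpha_prime = alpha / (1 - 3 * alpha)
    rho_prime = 1 / (1 - alpha)
    bound_17 = (1 - alpha) / (1 - 2 * alpha)
    bound_20 = (1 - 6*alpha + 10*alpha**2 - 3*alpha**3) / (alpha * (1 - alpha))
    rho = min(bound_17, bound_20)
    if rho <= 1:
        raise ValueError("alpha too large: rho must exceed 1")
    lam = max(3 * math.ceil(math.log((1 - alpha) / alpha, rho)) + 2,
              3 * math.ceil(math.log(1 / alpha, rho)))
    entropy = -alpha * math.log2(alpha) - (1 - alpha) * math.log2(1 - alpha)
    return Tuning(alpha, alpha_prime, rho, rho_prime, lam,
                  1 / entropy, 3 / math.log2(rho),
                  (2 + 3 * lam) / math.log2(rho_prime))


if __name__ == "__main__":
    for a in (0.05, 0.10, 0.15, 0.20):
        print(tune(a))
```

Evaluating `tune` over a grid of $\alpha$ values gives a quick way to weigh the access coefficient against the join and split coefficients for an expected operation mix.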


Chapter  4 

Conclusions  and  Future  Directions 


Biased 2-3 trees and biased weight-balanced trees are two examples of trees that implement
dynamic dictionaries with logarithmic performance in the worst case for access and
pind  operations,  and  with  logarithmic  performance  for  all  dynamic  operations,  provided 
the  cost  is  amortised  over  a  sequence;  the  only  technical  exception  is  that  the  join  operation 
for  biased  2-3  trees  runs  in  time  proportional  to  the  difference  in  rank  between  two  trees 
instead of the logarithm of the ratio of their weights. Even with this “problem” (which we saw
was not crucial due to the telescoping property of rank differences), biased 2-3 trees may
be preferable to biased weight-balanced trees because the algorithms are simpler and because
internal nodes need only store integer ranks, rather than real-number weights. Both
trees are powerful enough to support the network flow and self-organising data structure
applications.

A number of questions involving dynamic weighted data structures were suggested but
not answered in this thesis. We list a few here.

1. Special operations. The dynamic operations delete, promote, demote, and insert
were implemented in terms of join and split. Naturally, it should be possible to reduce
the running time of these operations by writing direct algorithms. For instance, there is
no reason to do a complete split at some node, essentially pulling it up to the root of the
tree, if the weight added to it by promote would merely cause it to move up a shorter
distance. Especially if promote and demote are only used with small values of $\delta$ (as
in the self-organising data structure application, where $\delta = 1$), a direct algorithm should
achieve substantial savings over an indirect one.
2. Discretization. Biased weight-balanced trees achieve true logarithmic performance, but
they  manipulate  real  numbers  (weights)  at  every  node.  For  practical  purposes  it  would  be 
desirable  to  make  most  of  the  arithmetic  involve  only  integers,  except  for  the  conversion 
necessary  from  the  weights  of  the  items  themselves.  This  was  the  case  for  biased  2-3  trees, 
but  had  the  consequence  that  the  running  time  of  the  join  operation  depended  on  the 
rank  difference  and  not  on  the  weight  ratio.  The  two  measures  are  related,  but  not  closely 
enough,  because  the  upper  and  lower  bounds  on  the  weight  of  a  node  with  a  given  rank  do 
not  match  within  a  constant  factor. 

It  would  be  interesting  to  find  a  data  structure  that  could  close  the  gap  between  these 
bounds  without  manipulating  real  numbers.  For  instance,  by  limiting  the  number  of  3- 
nodes appearing near each other it may be possible to avoid the bad case of a tree with
rank $r$ having weight about $3^r$.

3.  Other  trees.  The  general  approach  of  both  the  data  structures  described  here  is  to  allow 
heavy  item  nodes  to  appear  high  in  the  tree,  but  to  leave  chips  on  the  light  nodes  near  it  to 
pay  for  possible  later  operations  involving  these  nodes.  The  chips  make  up  for  the  fact  that 
the light nodes really belong farther down in the tree; they can be thought of as counting
the  number  of  dummy  nodes  it  would  be  necessary  to  introduce  to  form  a  path  down  to 
the  level  where  the  light  nodes  belong. 
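
The counting intuition can be sketched as follows. This is only an illustration, assuming that in a fully balanced tree the rank drops by exactly one per level; the precise chip invariants are the ones stated where each structure is defined, and the function name here is hypothetical.

```python
def dummy_nodes_needed(parent_rank, child_rank):
    """Number of dummy nodes on a path from a parent down to a light child.

    Assumption: a child sitting at its proper level would have rank
    parent_rank - 1, and each dummy node lowers the rank by one.
    """
    return max(0, parent_rank - child_rank - 1)


if __name__ == "__main__":
    # A child of rank 5 under a parent of rank 6 is already in place.
    print(dummy_nodes_needed(6, 5))
    # A child of rank 2 under the same parent sits three levels too high.
    print(dummy_nodes_needed(6, 2))
```

In this picture, a light node that sits three levels higher than it belongs would carry three chips, one for each dummy node that was never created.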

This  idea  is  probably  applicable  to  a  wide  variety  of  trees.  It  would  be  instructive  to 
carry  out  the  application  for  many  kinds  of  trees,  not  only  to  have  a  large  repertoire  of 
dynamic  weighted  data  structures,  but  also  to  determine  how  the  structure  constraints  of 
various  unweighted  structures  interact  with  the  balance  constraints  of  weighted  ones.  They 
do  not  always  merge  well;  for  instance  it  is  not  completely  obvious  how  to  make  a  biased 
B-tree  with  large  branching  factor.  One  problem  is  illustrated  in  Figure  4.1.  An  item 
node  of  rank  6  is  surrounded  by  minor  nodes  in  a  biased  3-6  tree  (each  internal  node  except 
the  root  has  between  3  and  6  children).  If  we  join  another  tree  with  rank  4,  there  are  now 
seven  nodes  with  rank  6  or  less,  but  we  cannot  make  two  trees  with  rank  7  from  them.  On
the  one  hand  we  ought  to  combine  some  of  the  smaller  nodes  together  into  one  subtree  of 
rank  5,  but  on  the  other  hand  we  cannot  do  that  too  often  or  we  would  reduce  the  number 
of  children  too  much.

Nevertheless,  it  seems  reasonable  to  expect  that  “biased”  versions  of  B-trees  and  RB-
trees  exist.  The  balance  conditions  will  just  need  to  be  subtler.  If  it  is  possible  to  exhibit
a  large  repertoire  of  good  structures,  it  would  be  interesting  to  examine  general  conditions
under  which  an  unweighted  structure  has  a  weighted  counterpart.  These  conditions
might  mention  the  flexibility  of  substructures  in  the  host  structure,  or  the  distribution  of 
information-bearing  and  bookkeeping  nodes,  or  other  properties.

4.  The  insert  problem.  It  is  difficult  to  insert  into  a  biased  2-3  tree  due  to  the  uncertainty 
of  how  far  down  in  the  tree  one  has  to  look  in  order  to  find  the  correct  place  to  insert  the 
new  item  (see  Section  1.2).  Two  different  approaches  might  alleviate  this  problem. 

First,  if  the  probability  that  an  INSERT  will  occur  in  each  interval  between  currently 
present  keys  is  known  in  advance,  items  corresponding  to  these  gaps  could  be  stored  along 
with  the  real  items,  and  given  appropriate  weights. 
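
A minimal sketch of this first approach, assuming the items are held as a sorted list of (key, weight) pairs and the gap probabilities are supplied in the same order. How a probability should be converted into a weight is not specified here, so the `gap_scale` parameter, like the function name and the tagging of entries, is a hypothetical choice.

```python
def with_gap_items(items, gap_probs, gap_scale=1.0):
    """Interleave real items with weighted placeholder items for the gaps.

    items     -- list of (key, weight) pairs, sorted by key
    gap_probs -- len(items) + 1 probabilities; gap_probs[i] is the chance
                 that the next INSERT falls just before items[i], and the
                 last entry covers keys larger than every item
    gap_scale -- assumed conversion factor from probability to weight
    """
    if len(gap_probs) != len(items) + 1:
        raise ValueError("need exactly one gap probability per gap")
    merged = []
    for i, (key, weight) in enumerate(items):
        merged.append((("gap", i), gap_probs[i] * gap_scale))
        merged.append((("item", key), weight))
    merged.append((("gap", len(items)), gap_probs[-1] * gap_scale))
    return merged


if __name__ == "__main__":
    for entry in with_gap_items([(10, 4.0), (20, 1.0)], [0.5, 0.25, 0.25]):
        print(entry)
```

The merged sequence could then be stored in the weighted tree, so that a search for a likely gap ends at a heavy placeholder near the root rather than deep in the tree.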

Second,  if  the  search  for  the  gap  goes  too  far  down  in  the  tree,  it  might  be  possible  to 
stop  prematurely  and  insert  the  item,  leaving  a  note  that  the  insert  never  finished.  Then 
a  later  operation  which  had  occasion  to  go  farther  down  the  tree  could  carry  the  item  along 
with  it.  Of  course  we  would  leave  chips  along  with  the  note  to  pay  for  carrying  on  the 
insert.  Two  problems  with  this  are  how  to  keep  track  of  many  items  waiting  for  their 
inserts  to  continue  through  the  same  node,  and  how  to  split  at  an  incompletely  INSERTed
node. 

5.  Amortisation.  Of  course,  the  main  question  is  whether  dynamic  weighted  data  structures 
exist  which  do  not  require  amortisation,  and  which  do  not  hide  it  in  some  other  way  (such 
as  creating  a  lot  of  dummy  nodes  to  intervene  between  light  nodes  and  heavy  item  nodes).
Kriegel  and  others  have  recently  worked  on  this  problem  [15,  16,  27].

But  actually  there  is  a  sense  in  which  amortisation  should  not  be  considered  a  problem 
with  this  approach,  but  rather  a  feature.  After  all,  if  a  light  node  joins  the  tree  at  less 
cost  than  its  fair  share,  why  should  we  pay  more?  We  may  eventually  have  to  pay  the 
remainder,  but  for  now  why  not  save  our  chips  whenever  possible?  Only  if  the  worst-case
response  of  each  operation  is  more  important  than  the  overall  response  of  all  operations
together  should  we  be  searching  for  an  unamortised  solution  to  the  dynamic  weighted  data
structure  problem. 